Abstract

Decentralized optimization of distributed stochastic differential systems has been an active area of research for over half a century. Its formulation utilizing static team and person-by-person optimality criteria is well investigated. However, the results have not been generalized to nonlinear distributed stochastic differential systems, possibly due to technical difficulties inherent with decentralized decision strategies.

In this first part of the two-part paper, we derive team optimality and person-by-person optimality conditions for distributed stochastic differential systems with different information structures. The optimality conditions are given in terms of a Hamiltonian system of equations described by a system of coupled backward and forward stochastic differential equations and a conditional Hamiltonian, under both regular and relaxed strategies. Our methodology is based on the semi martingale representation theorem and variational methods. Throughout the presentation we discuss similarities to optimality conditions of centralized decision making.
I. Introduction

Over the last 50 years many mathematical concepts and procedures were developed to design optimal control strategies for stochastic dynamical systems. We refer to this set of mathematical concepts and procedures as the "classical theory of stochastic optimization". It has been utilized extensively to address the questions of existence of optimal strategies, and necessary and sufficient optimality conditions for systems driven by continuous martingale processes (Brownian motion processes) and discontinuous martingale processes (jump processes). It has been successfully applied to centralized fully observable control problems, meaning the admissible strategies are functions of common noiseless measurements of the system [1]-[9], and to centralized partially observable control systems, meaning the admissible strategies are functions of common noisy measurements of the system [2], [10]-[13]. In addition, optimality conditions are derived for infinite dimensional systems and impulsive systems in [1], [9], [14]. Thus, the classical theory of optimization is developed on the assumption of centralized decisions or control actions. It presupposes that all information about the system can be acquired and that the decision policies (control actions) can be formulated accordingly. The basic underlying assumption is that the acquisition of the information is centralized, or that the information acquired at different locations is communicated to each decision maker or control.

When the system model consists of multiple decision makers, and the acquisition of information and its processing is decentralized or shared among several locations, the decision makers' actions are based on different information. We call the information available for such decisions "decentralized information structures or patterns". When the system model is dynamic, consisting of an interconnection of at least two subsystems, and the decisions are based on decentralized information structures, we call the overall system a "distributed system with decentralized information structures". Over the years several specific forms of decentralized information structures have been analyzed, mostly in discrete time [15]-[26], and more recently in [27]-[32]. However, at this stage there is no systematic framework addressing optimality conditions for distributed systems with decentralized information structures. The absence of such an optimization theory raises the question of whether the classical theory of optimization is limited in mathematical concepts and procedures to deal with decentralized systems.
Fig. 1. Diagram of architecture for distributed stochastic differential decision systems.



In this first part of the two-part investigation, we show that the classical theory of optimization does not have such a limitation. We consider a team game reward [23], [26], [33]-[35] and we apply concepts from the classical theory of optimization to derive necessary and sufficient optimality conditions for nonlinear stochastic distributed systems with decentralized information structures. Our methodology utilizes the semi martingale representation theorem and variational methods recently reported by the authors in [36].

The optimality conditions developed in this paper can be applied to many architectures of distributed systems such as Fig. 1 (see also [37]). Each decision maker makes its decision based on local information and exerts control action that affects the overall distributed system, without allowing communication between the local decision makers. Such systems are called distributed systems with decentralized information structures. The team formulation of the distributed system with decentralized information structures consists of an interconnection of $N$ subsystems. Each subsystem $i$ has its state denoted by $x^i \in \mathbb{X}^i$, a local decision maker or control input $u^i \in \mathbb{A}^i$, an exogenous Brownian motion noise input $W^i \in \mathbb{W}^i$, and a coupling from the other subsystems.

Decentralized Information Structures for Decision Makers 

The information structures of the local decision makers $u^i$, $i = 1, 2, \ldots, N$, are defined as follows. For any $t \in [0,T]$, the information structure available to decision maker (DM) $i$ is modeled by the $\sigma$-algebra $\mathcal{G}_{0,t}^i$ generated by the observable events associated with the local subsystem. These observables can be generated by nonanticipative functionals of the noise entering the system, nonanticipative functions of the state of the system, its delayed versions, or any possible combinations thereof. Let us denote the admissible strategies of DM $i$ with action spaces $\mathbb{A}^i$ by $\mathbb{U}^i[0,T]$, $i = 1, 2, \ldots, N$ (meaning that $u_t^i$ is a nonanticipative measurable functional of the information algebra $\mathcal{G}_T^i = \{\mathcal{G}_{0,t}^i : t \in [0,T]\}$ taking values from $\mathbb{A}^i$). Thus the augmented state, control and noise of the decentralized system can be written as

$$x = (x^1, \ldots, x^N), \quad u = (u^1, \ldots, u^N), \quad W = (W^1, \ldots, W^N).$$

Then the overall system can be expressed in compact form by the following stochastic Ito differential equation

$$dx(t) = f(t,x(t),u_t)\,dt + \sigma(t,x(t),u_t)\,dW(t), \quad x(0) = x_0, \quad t \in (0,T]. \tag{1}$$

Team Game Pay-off Functional

The objective is to find a team optimal strategy $u^o = (u^{1,o}, \ldots, u^{N,o}) \in \times_{i=1}^N \mathbb{U}^i[0,T]$ at which the pay-off functional defined by

$$J(u^o) \equiv J(u^{1,o}, \ldots, u^{N,o}) = \inf_{(u^1, \ldots, u^N) \in \times_{i=1}^N \mathbb{U}^i[0,T]} E\Big\{ \int_0^T \ell(t,x(t),u_t)\,dt + \varphi(x(T)) \Big\} \tag{2}$$

attains its minimum.

We consider two main classes of decentralized noiseless information structures: 1) nonanticipative functionals of any subset of the subsystems' Brownian motions $\{W^1, \ldots, W^N\}$, called "nonanticipative information structures", and 2) nonanticipative functionals of any subset of the subsystem states $\{x^1, \ldots, x^N\}$, called "feedback information structures" (see Section II-C).

Team Game Optimality Conditions 

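In Section V we derive team optimality conditions (Theorem 9) for pay-off (2) subject to (1), under a strong formulation of the filtered probability space $\big(\Omega, \mathbb{F}, \{\mathbb{F}_{0,t} : t \in [0,T]\}, \mathbb{P}\big)$. These are summarized below.
Define the Hamiltonian

$$\mathbb{H} : [0,T] \times \mathbb{X}^{(N)} \times \mathbb{X}^{(N)} \times \mathcal{L}(\mathbb{W}^{(N)}, \mathbb{X}^{(N)}) \times \mathbb{A}^{(N)} \to \mathbb{R}$$

by

$$\mathbb{H}(t, \xi, \zeta, M, \nu) = \langle f(t,\xi,\nu), \zeta \rangle + \mathrm{tr}\big(M^* \sigma(t,\xi,\nu)\big) + \ell(t,\xi,\nu), \quad t \in [0,T]. \tag{3}$$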
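To make the three terms of the Hamiltonian (3) concrete, the following is a minimal Python sketch that evaluates $\langle f, \zeta \rangle + \mathrm{tr}(M^* \sigma) + \ell$ for user-supplied coefficient functions. The function name `hamiltonian`, the list-of-rows matrix layout, and the sample coefficients in the usage note are illustrative assumptions, not taken from the paper.

```python
def hamiltonian(f, sigma, ell, t, xi, zeta, M, nu):
    """Evaluate H(t, xi, zeta, M, nu) = <f, zeta> + tr(M^T sigma) + ell.

    f(t, xi, nu)     -> list of n floats (drift)
    sigma(t, xi, nu) -> n x m matrix as a list of rows (diffusion)
    ell(t, xi, nu)   -> float (running cost)
    M                -> n x m matrix as a list of rows
    All names and data layouts here are hypothetical choices for illustration.
    """
    drift = f(t, xi, nu)
    diff = sigma(t, xi, nu)
    # Inner product <f, zeta> in R^n.
    inner = sum(a * b for a, b in zip(drift, zeta))
    # tr(M^T sigma) equals the entrywise sum of products M[k][j] * sigma[k][j].
    trace = sum(M[k][j] * diff[k][j]
                for k in range(len(diff))
                for j in range(len(diff[0])))
    return inner + trace + ell(t, xi, nu)
```

For instance, with the assumed coefficients `f = lambda t, x, u: [-x[0] + u[0], -x[1] + u[1] + 0.5 * x[0]]`, a constant diagonal `sigma` with entries 0.1 and 0.2, and a quadratic running cost, the call `hamiltonian(f, sigma, ell, 0.0, [1.0, 2.0], [0.5, -1.0], [[1.0, 0.0], [0.0, 2.0]], [0.0, 1.0])` adds an inner-product term of 0, a trace term of 0.5 and a running cost of 6.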
For any $u \in \mathbb{U}^{(N)}[0,T] = \times_{i=1}^N \mathbb{U}^i[0,T]$, consider the adjoint process $\{\psi, Q\}$ and the state $x$ satisfying the following backward and forward stochastic differential equations respectively,

$$d\psi(t) = -\mathbb{H}_x(t, x(t), \psi(t), Q(t), u_t)\,dt + Q(t)\,dW(t), \quad \psi(T) = \varphi_x(x(T)), \quad t \in [0,T), \tag{4}$$

$$dx(t) = \mathbb{H}_\psi(t, x(t), \psi(t), Q(t), u_t)\,dt + \sigma(t, x(t), u_t)\,dW(t), \quad x(0) = x_0, \quad t \in (0,T]. \tag{5}$$

The stochastic optimality conditions of the team game with decentralized noiseless information 
structures are given below. 

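(1) Necessary Conditions. Under certain conditions, which are precisely those of the classical theory of optimization, the following hold.

For an element $u^o \in \mathbb{U}^{(N)}[0,T] = \times_{i=1}^N \mathbb{U}^i[0,T]$ with the corresponding solution $x^o$ to be team optimal, it is necessary that the following hold:

The process $\{\psi^o, Q^o\}$ is the unique solution of the backward stochastic differential equation (4) corresponding to the pair $\{u^o, x^o\}$, and they together satisfy the pointwise almost sure inequalities with respect to the $\sigma$-algebras $\mathcal{G}_{0,t}^i$, $t \in [0,T]$, $i = 1, 2, \ldots, N$:

$$E\Big\{ \mathbb{H}\big(t, x^o(t), \psi^o(t), Q^o(t), u_t^{1,o}, \ldots, u_t^{i-1,o}, u^i, u_t^{i+1,o}, \ldots, u_t^{N,o}\big) \,\Big|\, \mathcal{G}_{0,t}^i \Big\} \geq E\Big\{ \mathbb{H}\big(t, x^o(t), \psi^o(t), Q^o(t), u_t^{1,o}, \ldots, u_t^{N,o}\big) \,\Big|\, \mathcal{G}_{0,t}^i \Big\},$$
$$\forall u^i \in \mathbb{A}^i, \quad \text{a.e. } t \in [0,T], \quad \mathbb{P}|_{\mathcal{G}_{0,t}^i}\text{-a.s.}, \quad i = 1, 2, \ldots, N. \tag{6}$$

(2) Sufficient Conditions. Under global convexity of the Hamiltonian with respect to the state and control variables and convexity of the terminal pay-off function $\varphi(\cdot)$, the pair $\{x^o(\cdot), u^o(\cdot)\}$ is optimal if it satisfies (6).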

An important feature obtained during the derivation is that the optimality conditions for a team optimal strategy are equivalent to the optimality conditions for a person-by-person optimal strategy. This follows from Theorem 6 and Corollary 1.

The point to be made regarding the derivation of the above optimality conditions is that we convert the problem into a centralized problem with the associated Hamiltonian system of equations to capture the constraints, and only at the final step is the optimality of decentralized strategies addressed, by identifying the conditional variational Hamiltonian which is consistent with the decentralized information structures. That is, the Hamiltonian system (4), (5) is the one corresponding to centralized strategies, while the conditional Hamiltonian (6) is the projection of the centralized Hamiltonian onto the subspace generated by the decentralized information structures.

We conclude the preliminary discussion on classical optimization theory of centralized strate- 
gies versus decentralized strategies, by stating that there are no limitations in applying classical 
theory of optimization to distributed systems with decentralized information structures. Rather, 
the challenge is in the computation of the conditional Hamiltonians, and hence the optimal 
strategies. However, this has also remained a challenge for centralized fully or partially observed 
strategies. 

The specific objectives of this paper are the following. 

(a) Derive team games necessary conditions of optimality (stochastic maximum principle) 
for distributed stochastic differential systems with decentralized information structures. 

(b) Introduce assumptions so that the team games necessary conditions of optimality in (a) 
are also sufficient; 

(c) Derive person-by-person optimality conditions and discuss their relation with team 
optimality conditions; 

(d) Prove existence of optimal team and person-by-person strategies for distributed stochas- 
tic differential systems with decentralized information structures, using the theory of 
relaxed control strategies, and relate (a), (b), (c) to regular decision strategies. 

A detailed investigation of applications of the results of this part to specific linear and nonlinear distributed stochastic differential decision systems is discussed in the second part of this two-part paper [38], where we derive the explicit expressions for the optimal decentralized strategies.

The rest of the paper is organized as follows. In Section II we formulate the distributed stochastic differential system with decentralized information structures. In Section III we consider the question of existence of optimal relaxed controls (decisions). In Section IV we develop the stochastic optimality conditions for team games with decentralized information structures, consisting of necessary and sufficient conditions of optimality. In Section V we specialize the necessary and sufficient optimality conditions to regular strategies and obtain corresponding necessary and sufficient optimality conditions. The paper is concluded with some comments on possible extensions of our results.

II. Team Games of Stochastic Differential Systems 

In this section we introduce the mathematical formulation of distributed stochastic systems, the 
information structures available to the decision makers for their actions, and the definitions of 
collaborative decisions via team game optimality and person-by-person optimality. Throughout 
the terms "decision maker" or "control" are used interchangeably. A stochastic dynamical de- 
cision or control system is called distributed if it consists of an interconnection of at least two 
subsystems and decision makers. The underlying assumption for these distributed systems is 
that the decision makers' actions are based on decentralized information structures. However, the
decision makers are allowed to exchange information on their law or strategy deployed, e.g., the 
functional form of their strategies but not their actions. 

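Some Basic Terminologies

DM: abbreviation for "Decision Maker"
$\mathbb{Z}_N = \{1, 2, \ldots, N\}$: subset of natural numbers
$s = \{s^1, s^2, \ldots, s^N\}$: set consisting of $N$ elements
$s^{-i} = s \setminus \{s^i\}$, $s = (s^{-i}, s^i)$: set $s$ minus $\{s^i\}$
$\mathcal{L}(\mathcal{X}, \mathcal{Y})$: linear transformation mapping a vector space $\mathcal{X}$ into a vector space $\mathcal{Y}$
$A^{(i)}$: $i$th column of a map $A \in \mathcal{L}(\mathbb{R}^n, \mathbb{R}^m)$, $i = 1, \ldots, n$
$(\mathbb{A}^i, d)$: separable metric space for player $i \in \mathbb{Z}_N$ actions
$\mathbb{A}^{(N)} = \times_{i=1}^N \mathbb{A}^i$: product action space of $N$ players
$\mathbb{U}^i_{reg}[0,T]$: regular admissible strategy of player $i \in \mathbb{Z}_N$
$\mathbb{U}^i_{rel}[0,T]$: relaxed admissible strategy of player $i \in \mathbb{Z}_N$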


February 15, 2013 DRAFT



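Let $\big(\Omega, \mathbb{F}, \{\mathbb{F}_{0,t} : t \in [0,T]\}, \mathbb{P}\big)$ denote a complete filtered probability space satisfying the usual conditions [39], that is, $(\Omega, \mathbb{F}, \mathbb{P})$ is complete and $\mathbb{F}_{0,0}$ contains all $\mathbb{P}$-null sets in $\mathbb{F}$. Note that filtrations $\{\mathbb{F}_{0,t} : t \in [0,T]\}$ are monotone in the sense that $\mathbb{F}_{0,s} \subseteq \mathbb{F}_{0,t}$, $\forall\, 0 \leq s \leq t \leq T$. Moreover, $\{\mathbb{F}_{0,t} : t \in [0,T]\}$ is called right continuous if $\mathbb{F}_{0,t} = \mathbb{F}_{0,t+} = \bigcap_{s>t} \mathbb{F}_{0,s}$, $\forall t \in [0,T)$, and it is called left continuous if $\mathbb{F}_{0,t} = \mathbb{F}_{0,t-} = \sigma\big(\bigcup_{s<t} \mathbb{F}_{0,s}\big)$, $\forall t \in (0,T]$. Throughout the paper filtrations are denoted by $\mathbb{F}_T = \{\mathbb{F}_{0,t} : t \in [0,T]\}$, and they are assumed to be right continuous and complete.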

Consider a random process $\{z(t) : t \in [0,T]\}$ defined on the filtered probability space $\big(\Omega, \mathbb{F}, \{\mathbb{F}_{0,t} : t \in [0,T]\}, \mathbb{P}\big)$ and taking values in a metric space $(Z, d)$. The process $\{z(t) : t \in [0,T]\}$ is said to be measurable if the map $(t, \omega) \to z(t,\omega)$ is $\mathcal{B}([0,T]) \times \mathbb{F} / \mathcal{B}(Z)$-measurable, where $\mathcal{B}(Z)$ denotes the Borel algebra of subsets of $Z$. The process $\{z(t) : t \in [0,T]\}$ is said to be $\{\mathbb{F}_{0,t} : t \in [0,T]\}$-adapted if for all $t \in [0,T]$, the map $\omega \to z(t,\omega)$ is $\mathbb{F}_{0,t} / \mathcal{B}(Z)$-measurable. The process $\{z(t) : t \in [0,T]\}$ is said to be $\{\mathbb{F}_{0,t} : t \in [0,T]\}$-progressively measurable if for all $t \in [0,T]$, the map $(s, \omega) \to z(s,\omega)$, $s \in [0,t]$, is $\mathcal{B}([0,t]) \otimes \mathbb{F}_{0,t} / \mathcal{B}(Z)$-measurable. It can be shown that any stochastic process $\{z(t) : t \in [0,T]\}$ on a filtered probability space $\big(\Omega, \mathbb{F}, \{\mathbb{F}_{0,t} : t \in [0,T]\}, \mathbb{P}\big)$ which is measurable and adapted has a progressively measurable modification [39]. Unless otherwise specified, we shall say a process $\{z(t) : t \in [0,T]\}$ is $\{\mathbb{F}_{0,t} : t \in [0,T]\}$-adapted if the process is $\{\mathbb{F}_{0,t} : t \in [0,T]\}$-progressively measurable.

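In our derivations we make extensive use of the following spaces considered by the authors in [1]. Let $L^2_{\mathbb{F}_T}([0,T], \mathbb{R}^n) \subset L^2(\Omega \times [0,T], d\mathbb{P} \times dt, \mathbb{R}^n) \equiv L^2([0,T], L^2(\Omega, \mathbb{R}^n))$ denote the space of $\mathbb{F}_T$-adapted random processes $\{z(t) : t \in [0,T]\}$ such that

$$E \int_{[0,T]} |z(t)|^2_{\mathbb{R}^n}\,dt < \infty,$$

which is a sub-Hilbert space of $L^2([0,T], L^2(\Omega, \mathbb{R}^n))$. Similarly, let $L^2_{\mathbb{F}_T}([0,T], \mathcal{L}(\mathbb{R}^m, \mathbb{R}^n)) \subset L^2([0,T], L^2(\Omega, \mathcal{L}(\mathbb{R}^m, \mathbb{R}^n)))$ denote the space of $\mathbb{F}_T$-adapted $n \times m$ matrix valued random processes $\{\Sigma(t) : t \in [0,T]\}$ such that

$$E \int_{[0,T]} |\Sigma(t)|^2_{\mathcal{L}(\mathbb{R}^m, \mathbb{R}^n)}\,dt = E \int_{[0,T]} \mathrm{tr}\big(\Sigma^*(t) \Sigma(t)\big)\,dt < \infty.$$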
A. Regular Strategies

In this subsection we consider measurable vector valued functions, also known as regular strategies. We consider the strong formulation. Let $\big(\Omega, \mathbb{F}, \{\mathbb{F}_{0,t} : t \in [0,T]\}, \mathbb{P}\big)$ denote a fixed complete filtered probability space on which all random processes considered in the paper are based. At this stage we do not specify how $\{\mathbb{F}_{0,t} : t \in [0,T]\}$ came about, but we require that Brownian motions are adapted to this filtration.

Admissible Decision Maker Strategies 

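The Decision Makers (DM) $\{u^i : i \in \mathbb{Z}_N\}$ take values in a closed convex subset of linear metric spaces $\{(\mathbb{M}^i, d) : i \in \mathbb{Z}_N\}$. Let $\mathcal{G}_T^i = \{\mathcal{G}_{0,t}^i : t \in [0,T]\} \subset \{\mathbb{F}_{0,t} : t \in [0,T]\}$ denote the information available to DM $i$, $\forall i \in \mathbb{Z}_N$. The admissible set of regular strategies is defined by

$$\mathbb{U}^i_{reg}[0,T] = \Big\{ u^i \in L^2_{\mathcal{G}_T^i}([0,T], \mathbb{R}^{d_i}) : u_t^i \in \mathbb{A}^i \subset \mathbb{R}^{d_i}, \ \text{a.e. } t \in [0,T], \ \mathbb{P}\text{-a.s.} \Big\}, \quad \forall i \in \mathbb{Z}_N. \tag{7}$$

Clearly, $\mathbb{U}^i_{reg}[0,T]$ is a closed convex subset of $L^2_{\mathbb{F}_T}([0,T], \mathbb{R}^{d_i})$, for $i = 1, 2, \ldots, N$. That is, $u^i : [0,T] \times \Omega \to \mathbb{A}^i$, and $\{u_t^i : t \in [0,T]\}$ is $\mathcal{G}_T^i$-adapted, $\forall i \in \mathbb{Z}_N$.

An $N$ tuple of DM strategies is by definition $(u^1, u^2, \ldots, u^N) \in \mathbb{U}^{(N)}_{reg}[0,T] = \times_{i=1}^N \mathbb{U}^i_{reg}[0,T]$, which are nonanticipative with respect to the information structures $\{\mathcal{G}_{0,t}^i : t \in [0,T]\}$, $i = 1, 2, \ldots, N$. Hence, the information structure of each DM, $\mathcal{G}_T^i$, is decentralized, and may be generated by local or global subsystem observables. Nonanticipative strategies are often utilized when deriving the minimum principle for centralized stochastic control or decision systems.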

Distributed Stochastic Systems 

Given a fixed probability space $\big(\Omega, \mathbb{F}, \{\mathbb{F}_{0,t} : t \in [0,T]\}, \mathbb{P}\big)$, a distributed stochastic system consists of an interconnection of $N$ subsystems. Each subsystem $i$ has its own state space $\mathbb{R}^{n_i}$, action space $\mathbb{A}^i \subset \mathbb{R}^{d_i}$, an exogenous noise space $\mathbb{W}^i = \mathbb{R}^{m_i}$, and an initial state $x^i(0) = x_0^i$, identified by the following quantities.

(S1) $x^i(0) = x_0^i$: an $\mathbb{R}^{n_i}$-valued random variable;

(S2) $\{W^i(t) : t \in [0,T]\}$: an $\mathbb{R}^{m_i}$-valued standard Brownian motion which models the exogenous state noise, adapted to $\mathbb{F}_T$, independent of $x^i(0)$.
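Each subsystem is described by a finite dimensional system of coupled stochastic differential equations of Ito type as follows.

$$dx^i(t) = f^i(t, x^i(t), u_t^i)\,dt + \sigma^i(t, x^i(t), u_t^i)\,dW^i(t) + \sum_{j=1, j \neq i}^N f^{ij}(t, x^j(t), u_t^j)\,dt + \sum_{j=1, j \neq i}^N \sigma^{ij}(t, x^j(t), u_t^j)\,dW^j(t),$$
$$x^i(0) = x_0^i, \quad t \in (0,T], \quad i \in \mathbb{Z}_N. \tag{8}$$

On the product space $(\mathbb{X}^{(N)}, \mathbb{A}^{(N)}, \mathbb{W}^{(N)})$, where $\mathbb{X}^{(N)} = \times_{i=1}^N \mathbb{R}^{n_i}$, $\mathbb{A}^{(N)} = \times_{i=1}^N \mathbb{A}^i$, $\mathbb{W}^{(N)} = \times_{i=1}^N \mathbb{W}^i$, one defines the augmented vectors by

$$W = (W^1, W^2, \ldots, W^N), \quad u = (u^1, u^2, \ldots, u^N), \quad x = (x^1, x^2, \ldots, x^N).$$

Then on the product space the distributed system is described in compact form by

$$dx(t) = f(t, x(t), u_t)\,dt + \sigma(t, x(t), u_t)\,dW(t), \quad x(0) = x_0, \quad t \in (0,T], \tag{9}$$

where $f : [0,T] \times \mathbb{R}^n \times \mathbb{A}^{(N)} \to \mathbb{R}^n$ denotes the drift and $\sigma : [0,T] \times \mathbb{R}^n \times \mathbb{A}^{(N)} \to \mathcal{L}(\mathbb{R}^m, \mathbb{R}^n)$ the diffusion coefficients. Note that (9) is very general since no specific interconnection structure is assumed among the different subsystems.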
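As a rough illustration of the coupled dynamics (8) under decentralized information, the following Python sketch simulates two scalar subsystems with an Euler-Maruyama scheme. The linear coefficients, the coupling gain 0.5, the function name `simulate`, and the choice that each DM observes only its own state are assumptions made for this example and do not come from the paper.

```python
import math
import random


def simulate(x0, controls, T=1.0, steps=1000, noise=(0.2, 0.2), seed=0):
    """Euler-Maruyama simulation of two coupled scalar subsystems.

    Illustrative dynamics (an assumption, not the paper's model):
        dx1 = (-x1 + u1 + 0.5 * x2) dt + noise[0] dW1
        dx2 = (-x2 + u2 + 0.5 * x1) dt + noise[1] dW2
    controls[i] maps (t, x_i) -> u_i, so DM i uses only local information.
    Returns the list of (x1, x2) states at the grid points.
    """
    rng = random.Random(seed)
    dt = T / steps
    x1, x2 = x0
    path = [(x1, x2)]
    for k in range(steps):
        t = k * dt
        # Each decision maker acts on its own state only (decentralized).
        u1 = controls[0](t, x1)
        u2 = controls[1](t, x2)
        # Independent Brownian increments with variance dt.
        dw1 = rng.gauss(0.0, math.sqrt(dt))
        dw2 = rng.gauss(0.0, math.sqrt(dt))
        x1_new = x1 + (-x1 + u1 + 0.5 * x2) * dt + noise[0] * dw1
        x2_new = x2 + (-x2 + u2 + 0.5 * x1) * dt + noise[1] * dw2
        x1, x2 = x1_new, x2_new
        path.append((x1, x2))
    return path
```

A call such as `simulate((1.0, -1.0), [lambda t, x: -x, lambda t, x: -x])` produces one sample path under local linear feedback. Replacing the second argument of each control by a function of both states would correspond to a centralized information structure.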

Pay-off Functional 

Consider the distributed system (9) with decentralized full information structures. Given a $u \in \mathbb{U}^{(N)}_{reg}[0,T]$ we define the reward or performance criterion by

$$J(u) \equiv J(u^1, u^2, \ldots, u^N) = E\Big\{ \int_0^T \ell(t, x(t), u_t)\,dt + \varphi(x(T)) \Big\}, \tag{10}$$

where $\ell : [0,T] \times \mathbb{R}^n \times \mathbb{A}^{(N)} \to (-\infty, \infty]$ denotes the integrand for the running cost functional and $\varphi : \mathbb{R}^n \to (-\infty, \infty]$ the terminal cost function. Notice that the performance of the decentralized system is measured by a single pay-off functional. The interpretation is that there is a centralized layer where the quality of the individual decision makers' strategies is evaluated for a common goal. Therefore, the underlying assumption concerning the single pay-off instead of multiple pay-offs (one for each decision maker) is that the team objective can be met.
For deterministic as well as stochastic systems, it is well known that if the set $\mathbb{A}^i$ is not convex, there may not exist any optimal control. For this reason it is necessary to introduce relaxed strategies as discussed in the next subsection.
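The pay-off (10) is an expectation over sample paths, so it can be approximated numerically by Monte Carlo averaging. The sketch below is a hedged illustration only: the decoupled drift $-x^i + u^i$, the constant noise level, and the function name `estimate_payoff` are assumptions chosen to keep the example short, not the paper's model.

```python
import math
import random


def estimate_payoff(controls, ell, phi, x0, T=1.0, steps=200,
                    samples=500, noise=0.2, seed=0):
    """Monte Carlo estimate of J(u) = E{ int_0^T ell dt + phi(x(T)) }.

    Illustrative dynamics (an assumption): dx_i = (-x_i + u_i) dt + noise dW_i.
    controls[i] maps (t, x_i) -> u_i (local feedback for DM i).
    ell(t, x, u) and phi(x) take lists of states and actions.
    """
    rng = random.Random(seed)
    dt = T / steps
    total = 0.0
    for _ in range(samples):
        x = list(x0)
        cost = 0.0
        for k in range(steps):
            t = k * dt
            u = [controls[i](t, x[i]) for i in range(len(x))]
            # Left-endpoint Riemann sum for the running cost.
            cost += ell(t, x, u) * dt
            # Euler-Maruyama step for every subsystem.
            x = [x[i] + (-x[i] + u[i]) * dt
                 + noise * rng.gauss(0.0, math.sqrt(dt))
                 for i in range(len(x))]
        total += cost + phi(x)
    return total / samples
```

With the noise switched off and a single subsystem under zero control, the estimate can be compared with the closed-form value $\int_0^1 e^{-2t}\,dt + e^{-2}$ for the quadratic costs $\ell = x^2 + u^2$ and $\varphi = x^2$, which gives a quick sanity check of the discretization.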
B. Relaxed Strategies

This paper will focus on relaxed strategies (also called randomized strategies) and later on 
specialize to regular strategies (measurable functions). Therefore, we introduce the formulation 
based on relaxed strategies (e.g. probability measures on the action space). 

Distributed Stochastic Systems 

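For each $i \in \mathbb{Z}_N$, let $(\mathbb{M}^i, d)$ be a separable metric space with $\mathbb{A}^i \subset \mathbb{M}^i$ compact, and let $\mathcal{B}(\mathbb{A}^i)$ denote the Borel subsets of $\mathbb{A}^i$. Let $C(\mathbb{A}^i)$ denote the space of continuous functions on $\mathbb{A}^i$. Let $\mathcal{M}(\mathbb{A}^i)$ denote the space of regular bounded signed Borel measures on $\mathcal{B}(\mathbb{A}^i)$ and $\mathcal{M}_1(\mathbb{A}^i) \subset \mathcal{M}(\mathbb{A}^i)$ the space of regular probability measures. The DM strategies with different information structures on the time interval $[0,T]$ will be described through the topological dual of the Banach space $L^1_{\mathcal{G}_T^i}([0,T], C(\mathbb{A}^i))$, the $L^1$-space of $\mathcal{G}_T^i = \{\mathcal{G}_{0,t}^i : t \in [0,T]\}$-adapted $C(\mathbb{A}^i)$ valued functions, for $i \in \mathbb{Z}_N$. For each $i \in \mathbb{Z}_N$ the dual of this space is given by $L^\infty_{\mathcal{G}_T^i}([0,T], \mathcal{M}(\mathbb{A}^i))$, which consists of weak* measurable $\mathcal{G}_T^i$-adapted $\mathcal{M}(\mathbb{A}^i)$ valued functions. The DM (control) strategies are drawn from the subspace $L^\infty_{\mathcal{G}_T^i}([0,T], \mathcal{M}_1(\mathbb{A}^i)) \subset L^\infty_{\mathcal{G}_T^i}([0,T], \mathcal{M}(\mathbb{A}^i))$. For notational convenience we denote this by

$$\mathbb{U}^i_{rel}[0,T] = L^\infty_{\mathcal{G}_T^i}([0,T], \mathcal{M}_1(\mathbb{A}^i)), \quad i \in \mathbb{Z}_N, \tag{11}$$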

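and the team strategies by the product space

$$\mathbb{U}^{(N)}_{rel}[0,T] = \times_{i=1}^N \mathbb{U}^i_{rel}[0,T].$$

Thus, for any $i \in \mathbb{Z}_N$, given the information $\mathcal{G}_T^i$, player $\{u_t^i : t \in [0,T]\}$ is a stochastic kernel (conditional distribution) defined by

$$u_t^i(\Gamma) = q_t^i(\Gamma \,|\, \mathcal{G}_{0,t}^i), \quad \text{for } t \in [0,T], \text{ and } \forall \Gamma \in \mathcal{B}(\mathbb{A}^i).$$

Clearly, for each $i \in \mathbb{Z}_N$ and for every $\varphi \in C(\mathbb{A}^i)$ the process

$$\int_{\mathbb{A}^i} \varphi(\xi)\, u_t^i(d\xi) = \int_{\mathbb{A}^i} \varphi(\xi)\, q_t^i(d\xi \,|\, \mathcal{G}_{0,t}^i)$$

is $\mathcal{G}_T^i$-progressively measurable. Given a $u \in \mathbb{U}^{(N)}_{rel}[0,T]$, the distributed system is written in compact form as

$$dx(t) = f(t, x(t), u_t)\,dt + \sigma(t, x(t), u_t)\,dW(t), \quad x(0) = x_0, \quad t \in [0,T], \tag{12}$$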
where the drift and diffusion coefficients are now defined by

$$F(t, x, u_t) = \int_{\mathbb{A}^{(N)}} F(t, x, \xi^1, \xi^2, \ldots, \xi^N) \times_{i=1}^N u_t^i(d\xi^i), \quad t \in [0,T), \tag{13}$$

for $F = \{f, \sigma\}$.
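To see what the averaging in (13) does, the following Python sketch evaluates a relaxed coefficient when every DM uses a finitely supported probability measure on its action set. Finite support, the dictionary representation of each measure, and the function name `relaxed_coefficient` are assumptions for illustration; the paper works with general regular probability measures.

```python
from itertools import product


def relaxed_coefficient(F, t, x, kernels):
    """Average F(t, x, xi^1, ..., xi^N) under a product of discrete measures.

    kernels[i] is a dict {action: probability} for DM i (finite support is
    an assumption made for this sketch). The joint weight of an action
    profile is the product of the individual probabilities, mirroring the
    product measure in (13).
    """
    total = 0.0
    for joint in product(*(k.items() for k in kernels)):
        actions = [a for a, _ in joint]
        weight = 1.0
        for _, p in joint:
            weight *= p
        total += weight * F(t, x, *actions)
    return total
```

Taking Dirac measures such as `[{1: 1.0}, {-1: 1.0}]` recovers the regular strategy with actions $(1, -1)$. With the non-convex action set $\{-1, 1\}$ and equal weights, the averaged drift can take values that no single action attains, which is the convexification that relaxed strategies provide.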
Pay-off Functional 

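Given a $u \in \mathbb{U}^{(N)}_{rel}[0,T]$ the performance criterion is defined by

$$J(u) = E\Big\{ \int_0^T \ell(t, x(t), u_t)\,dt + \varphi(x(T)) \Big\} \tag{14}$$

$$\equiv E\Big\{ \int_0^T \int_{\mathbb{A}^{(N)}} \ell(t, x(t), \xi^1, \xi^2, \ldots, \xi^N) \times_{i=1}^N u_t^i(d\xi^i)\,dt + \varphi(x(T)) \Big\}, \tag{15}$$

where $\ell$ and $\varphi$ are as defined before.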

C. Team and Person-by-Person Optimality 

In this section we give the precise definitions of team and person-by-person (i.e., player- 
by-player) optimality for relaxed and regular strategies. There are many possible information 
structures for control strategies $\{u^i : i \in \mathbb{Z}_N\}$. We consider the following.

(NIS): Nonanticipative Information Structures. Decision $u^i$ is adapted to the filtration $\mathcal{G}_T^i \subset \mathbb{F}_T$ which is generated by the $\sigma$-algebra induced by any combination of the subsystems' Brownian motions and their increments $\{(W^1(t), W^2(t), \ldots, W^N(t)) : t \in [0,T]\}$, $\forall i \in \mathbb{Z}_N$. This is often called open loop information, and it is the one used in classical stochastic control with centralized full information to derive the maximum principle.

(FIS): Feedback Information Structures. Decision $u^i$ is adapted to the filtration $\mathcal{G}_T^{z^i}$ generated by the $\sigma$-algebra $\mathcal{G}_{0,t}^{z^i} = \sigma\{z^i(s) : 0 \leq s \leq t\}$, $t \in [0,T]$, where the observables $z^i$ are nonanticipative measurable functionals of any combination of the states defined by

$$z^i(t) = h^i(t, x), \quad h^i : [0,T] \times C([0,T], \mathbb{R}^n) \to \mathbb{R}^{k_i}, \quad i \in \mathbb{Z}_N. \tag{16}$$
Note that the state $x$ and hence the observables may depend on controls.
The set of admissible regular feedback strategies is defined by

$$\mathbb{U}^{(N),z}_{reg}[0,T] = \Big\{ u \in \mathbb{U}^{(N)}_{reg}[0,T] : u_t^i \text{ is } \mathcal{G}_{0,t}^{z^i}\text{-measurable}, \ t \in [0,T], \ i = 1, \ldots, N \Big\}. \tag{17}$$

Similarly, the set of admissible relaxed feedback strategies is defined by

$$\mathbb{U}^{(N),z}_{rel}[0,T] = \Big\{ u \in \mathbb{U}^{(N)}_{rel}[0,T] : u^i \in L^\infty_{\mathcal{G}_T^{z^i}}([0,T], \mathcal{M}_1(\mathbb{A}^i)), \ i = 1, \ldots, N \Big\}. \tag{18}$$

One might be tempted to believe that nonanticipative strategies might be restrictive, because they 
are not explicitly described in terms of feedback. We will show that this is not true. In fact such 
strategies cover a large number of interesting problems. 

Problem 1. (Team Optimality) 

(RS): Relaxed Strategies. Given the pay-off functional (14), constraint (12), the $N$ tuple of relaxed strategies $u^o = (u^{1,o}, u^{2,o}, \ldots, u^{N,o}) \in \mathbb{U}^{(N)}_{rel}[0,T]$ is called nonanticipative team optimal if it satisfies

$$J(u^{1,o}, u^{2,o}, \ldots, u^{N,o}) \leq J(u^1, u^2, \ldots, u^N), \quad \forall u = (u^1, u^2, \ldots, u^N) \in \mathbb{U}^{(N)}_{rel}[0,T]. \tag{19}$$

Any $u^o \in \mathbb{U}^{(N)}_{rel}[0,T]$ satisfying (19) is called an optimal relaxed decision strategy (or control) and the corresponding $x^o(\cdot) \equiv x(\cdot; u^o(\cdot))$ (satisfying (12)) the optimal state process.
Similarly, feedback team optimal strategies are defined with respect to $u^o \in \mathbb{U}^{(N),z}_{rel}[0,T]$.

(NRS): Regular Strategies. Regular nonanticipative team optimal strategies are defined with respect to pay-off (10), constraint (9), and $u^o \in \mathbb{U}^{(N)}_{reg}[0,T]$, while feedback team optimal strategies are defined with respect to $u^o \in \mathbb{U}^{(N),z}_{reg}[0,T]$.

By definition, Problem 1 is a dynamic team problem with each DM having a different information structure (decentralized). To the best of the authors' knowledge there seems to have been no attempt in the literature to address Problem 1. An alternative approach to handle such problems with decentralized information structures is to restrict the definition of optimality to the so-called person-by-person (player-by-player) equilibrium.
Define

$$J(v, u^{-i}) = J(u^1, \ldots, u^{i-1}, v, u^{i+1}, \ldots, u^N).$$
Problem 2. (Person-by-Person Optimality)

(RS): Relaxed Strategies. Given the pay-off functional (14), constraint (12), the $N$ tuple of relaxed strategies $u^o = (u^{1,o}, u^{2,o}, \ldots, u^{N,o}) \in \mathbb{U}^{(N)}_{rel}[0,T]$ is called nonanticipative person-by-person optimal if it satisfies

$$J(u^{i,o}, u^{-i,o}) = J(u^o) \leq J(u^i, u^{-i,o}), \quad \forall u^i \in \mathbb{U}^i_{rel}[0,T], \quad \forall i \in \mathbb{Z}_N. \tag{20}$$

Similarly, feedback person-by-person optimal strategies are defined with respect to $u^o \in \mathbb{U}^{(N),z}_{rel}[0,T]$.

(NRS): Regular Strategies. Regular nonanticipative person-by-person optimal strategies are defined with respect to pay-off (10), constraint (9), and $u^o \in \mathbb{U}^{(N)}_{reg}[0,T]$, while feedback person-by-person optimal strategies are defined with respect to $u^o \in \mathbb{U}^{(N),z}_{reg}[0,T]$.

The interpretation of (20) is that the variation and hence evaluation (of team optimality) is done by the central layer, and it is this layer alone that can determine if the decision for the $i$-th player is optimal or not. Even for Problem 2 the authors of this paper are not aware of any publication which addresses necessary and/or sufficient conditions of optimality. Conditions (20) are analogous to the Nash equilibrium strategies of team games consisting of a single pay-off and $N$ DM. The person-by-person optimal strategy states that none of the $N$ members (possibly with different information structures) can deviate unilaterally from the optimal strategy and gain by doing so. The rationale for the restriction to person-by-person optimal strategies is based on the fact that the actions of the DM are not communicated to each other, and hence they cannot do better than restricting attention to this optimal strategy.

Problems 1 and 2 using relaxed strategies are the main problems addressed in this paper, while conclusions for regular strategies are drawn from these results. Clearly, any strategy which is optimal for Problem 1 is also person-by-person optimal and hence optimal for Problem 2.
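The relation between (19) and (20) can be checked by brute force on a toy static problem with finitely many actions. The Python sketch below is only an illustration under that assumption: the cost table and the function names `team_optimal` and `is_pbp_optimal` are hypothetical, and the dynamic problems in this paper are of course not finite.

```python
from itertools import product


def team_optimal(J, action_sets):
    """Return a joint action minimizing J over the whole product set, as in (19)."""
    return min(product(*action_sets), key=J)


def is_pbp_optimal(J, u, action_sets):
    """Check (20): no single player can lower J by a unilateral deviation."""
    for i, actions in enumerate(action_sets):
        for v in actions:
            deviated = u[:i] + (v,) + u[i + 1:]
            if J(deviated) < J(u):
                return False
    return True


# Hypothetical common cost for two players with actions {0, 1}.
cost = {(0, 0): 1.0, (0, 1): 3.0, (1, 0): 3.0, (1, 1): 0.0}
J = cost.__getitem__
sets = [(0, 1), (0, 1)]
```

In this table the profile `(1, 1)` is team optimal and therefore also person-by-person optimal, while `(0, 0)` is person-by-person optimal without being team optimal, since either player deviating alone raises the common cost from 1 to 3. This shows in miniature why the converse of the statement above can fail without further assumptions.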

III. Existence of Team Optimal Strategies

As mentioned earlier, not every control problem admits optimal regular strategies. However, 
in many problems relaxed strategies exist under certain mild assumptions. In this section we use 
a similar procedure as the one developed in [36] for centralized information structures to prove (i) existence of solution of the distributed stochastic dynamical decision system (12), and (ii) existence of optimal relaxed strategies for Problem 1.

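A generalized sequence $u^{i,\alpha} \in \mathbb{U}^i_{rel}[0,T]$ is said to converge (in the weak* topology or) vaguely to $u^{i,o}$, written $u^{i,\alpha} \xrightarrow{v} u^{i,o}$, if and only if for every $\varphi \in L^1_{\mathcal{G}_T^i}([0,T], C(\mathbb{A}^i))$

$$E \int_{[0,T] \times \mathbb{A}^i} \varphi_t(\xi)\, u_t^{i,\alpha}(d\xi)\,dt \longrightarrow E \int_{[0,T] \times \mathbb{A}^i} \varphi_t(\xi)\, u_t^{i,o}(d\xi)\,dt \quad \text{as } \alpha \to \infty, \quad \forall i \in \mathbb{Z}_N.$$

With respect to the vague (weak*) topology the set $\mathbb{U}^i_{rel}[0,T]$ is compact, and from here on we assume that $\mathbb{U}^i_{rel}[0,T]$, $\forall i \in \mathbb{Z}_N$, has been endowed with this vague topology.

Let $B^\infty_{\mathbb{F}_T}([0,T], L^2(\Omega, \mathbb{R}^n))$ denote the space of $\mathbb{F}_T$-adapted $\mathbb{R}^n$ valued second order random processes endowed with the norm topology $\| \cdot \|$ defined by

$$\| x \|^2 = \sup_{t \in [0,T]} E |x(t)|^2_{\mathbb{R}^n}.$$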

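To study the question of existence of solution to (12) we use the following assumptions.

Assumptions 1. The drift $f$ and diffusion coefficients $\sigma$ associated with (12) are defined by the Borel measurable maps

$$f : [0,T] \times \mathbb{R}^n \times \mathbb{A}^{(N)} \to \mathbb{R}^n, \quad \sigma : [0,T] \times \mathbb{R}^n \times \mathbb{A}^{(N)} \to \mathcal{L}(\mathbb{R}^m, \mathbb{R}^n),$$

and they are continuous in the last two arguments and assumed to satisfy the following basic properties.

(A0) $(\mathbb{A}^i, d)$, $\forall i \in \mathbb{Z}_N$, are compact.
There exists a $K \in L^{2,+}([0,T], \mathbb{R})$ such that

(A1) $|f(t,x,\xi) - f(t,y,\xi)|_{\mathbb{R}^n} \leq K(t) |x - y|_{\mathbb{R}^n}$ uniformly in $\xi \in \mathbb{A}^{(N)}$;

(A2) $|f(t,x,\xi)|_{\mathbb{R}^n} \leq K(t)(1 + |x|_{\mathbb{R}^n})$ uniformly in $\xi \in \mathbb{A}^{(N)}$;

(A3) $|\sigma(t,x,\xi) - \sigma(t,y,\xi)|_{\mathcal{L}(\mathbb{R}^m, \mathbb{R}^n)} \leq K(t) |x - y|_{\mathbb{R}^n}$ uniformly in $\xi \in \mathbb{A}^{(N)}$;

(A4) $|\sigma(t,x,\xi)|_{\mathcal{L}(\mathbb{R}^m, \mathbb{R}^n)} \leq K(t)(1 + |x|_{\mathbb{R}^n})$ uniformly in $\xi \in \mathbb{A}^{(N)}$;

(A5) $f(t,x,\cdot)$, $\sigma(t,x,\cdot)$ are continuous in $\xi \in \mathbb{A}^{(N)}$, $\forall (t,x) \in [0,T] \times \mathbb{R}^n$.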
Assumptions 1, (A1)-(A4), are the so-called Ito conditions for existence and uniqueness of strong solutions (having continuous sample paths) [8].

The following lemma proves the existence of solutions and their continuous dependence on the decision variables.

Lemma 1. Suppose Assumptions 1 hold. Then for any $\mathbb{F}_{0,0}$-measurable initial state $x_0$ having finite second moment, and any $u \in \mathbb{U}^{(N)}_{rel}[0,T]$, the following hold.

(1) System (12) has a unique solution $x \in B^\infty_{\mathbb{F}_T}([0,T], L^2(\Omega, \mathbb{R}^n))$ having a continuous modification, that is, $x \in C([0,T], \mathbb{R}^n)$, $\mathbb{P}$-a.s., $\forall i \in \mathbb{Z}_N$.

(2) The solution of system (12) is continuously dependent on the control, in the sense that, as $u^{i,\alpha} \xrightarrow{v} u^{i,o}$ in $\mathbb{U}^i_{rel}[0,T]$, $\forall i \in \mathbb{Z}_N$, $x^\alpha \xrightarrow{s} x^o$ in $B^\infty_{\mathbb{F}_T}([0,T], L^2(\Omega, \mathbb{R}^n))$, $\forall i \in \mathbb{Z}_N$.

These statements also hold for feedback strategies $u \in \mathbb{U}^{(N),z}_{rel}[0,T]$.

Proof: Since the class of policies $\mathbb{U}^i_{rel}[0,T]$, $\forall i \in \mathbb{Z}_N$, is compact in the vague topology, $\times_{i=1}^N \mathbb{U}^i_{rel}[0,T]$ is also compact in this topology. Utilizing this observation the proof is identical to that of [36], Lemma 3.1.

■

Using the results of Lemma 1, in the next theorem we establish existence of a minimizer $u^o \in \mathbb{U}^{(N)}_{rel}[0,T]$ for Problem 1. We need the following assumptions.

Assumptions 2. The functions $\ell$ and $\varphi$ associated with the pay-off (14) are Borel measurable maps

$$\ell : [0,T] \times \mathbb{R}^n \times \mathbb{A}^{(N)} \to (-\infty, +\infty], \quad \varphi : \mathbb{R}^n \to (-\infty, +\infty],$$

satisfying the following basic conditions:

(B1) $x \to \ell(t,x,\xi)$ is continuous on $\mathbb{R}^n$ for each $t \in [0,T]$, uniformly with respect to $\xi \in \mathbb{A}^{(N)}$;

(B2) $\exists\, h \in L_1^+([0,T], \mathbb{R})$ such that for each $t \in [0,T]$, $|\ell(t,x,\xi)| \leq h(t)(1 + |x|^2_{\mathbb{R}^n})$;

(B3) $x \to \varphi(x)$ is lower semicontinuous on $\mathbb{R}^n$ and $\exists\, c_0, c_1 > 0$ such that $|\varphi(x)| \leq c_0 + c_1 |x|^2_{\mathbb{R}^n}$.

Now we present the following existence theorem [36].
Theorem 1. (Existence of Team Optimal Strategies) Consider Problem 1 and suppose Assumptions 1 and 2 hold. Then there exists a team decision $u^o = (u^{1,o}, u^{2,o}, \ldots, u^{N,o}) \in \mathbb{U}^{(N)}_{rel}[0,T]$ at which $J(u^1, u^2, \ldots, u^N)$ attains its infimum. Existence also holds for $u^o \in \mathbb{U}^{(N),z}_{rel}[0,T]$.

Proof: Since the class of control policies $\mathbb{U}^{(N)}_{rel}[0,T]$ is compact in the vague topology, it suffices to prove that $J(\cdot)$ is lower semicontinuous with respect to this topology. This follows precisely from the same procedure as in [36], Theorem 3.2.

■

We conclude this section by stating that existence of team optimal strategies utilizing decentralized information structures follows directly from analogous results of centralized stochastic control strategies [13].



IV. Optimality Conditions for Relaxed Strategies

In this section we present the necessary and sufficient conditions of optimality for the team game of Problem 1. The derivation of the stochastic minimum principle (necessary conditions of optimality), or stochastic Pontryagin's minimum principle, is based on the martingale representation approach. For this reason we shall first state certain fundamental properties of semi martingales, which are used in the derivation.

Definition 1. Let $\mathbb{F}_T$ denote a complete filtration generated by an $m$-dimensional Brownian motion process $\{W(t) : t \in [0,T]\}$. An $\mathbb{R}^n$-valued random process $\{m(t) : t \in [0,T]\}$ is said to be a square integrable continuous $\mathbb{F}_T$-semi martingale if and only if it has a representation

$$m(t) = m(0) + \int_0^t v(s)\,ds + \int_0^t \Sigma(s)\,dW(s), \quad t \in [0,T], \tag{21}$$

for some $v \in L^2_{\mathbb{F}_T}([0,T], \mathbb{R}^n)$ and $\Sigma \in L^2_{\mathbb{F}_T}([0,T], \mathcal{L}(\mathbb{R}^m, \mathbb{R}^n))$ and for some $\mathbb{R}^n$-valued $\mathbb{F}_{0,0}$-measurable random variable $m(0)$ having finite second moment. The set of all such semi martingales is denoted by $\mathcal{SM}^2[0,T]$.

We need the following class of $\mathbb{F}_T$-semi martingales:

$$\mathcal{SM}^2_0[0,T] = \Big\{ m \in \mathcal{SM}^2[0,T] : m(t) = \int_0^t v(s)\,ds + \int_0^t \Sigma(s)\,dW(s), \ t \in [0,T] \Big\}. \tag{22}$$
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Now we present a fundamental result which is used in the derivation of minimum principle. 

Theorem 2. (Semi martingale Representation) The class of semi martingales SAd^^^T] is a 
real linear vector space and it is a Hilbert space with respect to the norm topology || m \\smI\o t] 
given by 

II ^ \\sMi[m= / Ht)\lM + ^ f tri^*mit))dty. 

^ J[0,T] J[0,T] ^ 

Moreover, the space iSA^q[0,T] is isometrically isomorphic to the space 

L2^([0, T], M") X L^^([0, T], £(M'", M")). 

Proof: For proof see Theorem 4.3 in [|36l . 

■ 
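
In concrete terms, the Hilbert space structure of Theorem 2 can be written through the intensities. The following display is a sketch, assuming the norm above is induced by the natural inner product obtained by polarization; the symbols $m_1, m_2$ are introduced here only for illustration.

```latex
% m_k \in \mathcal{SM}^2_0[0,T] with intensity (v_k,\Sigma_k), k = 1,2 (illustrative names)
(m_1, m_2)_{\mathcal{SM}^2_0[0,T]}
  = E\int_0^T \langle v_1(t), v_2(t)\rangle_{\mathbb{R}^n}\,dt
  + E\int_0^T \mathrm{tr}\big(\Sigma_1^*(t)\Sigma_2(t)\big)\,dt ,
\qquad
\|m\|^2_{\mathcal{SM}^2_0[0,T]} = (m,m)_{\mathcal{SM}^2_0[0,T]} .

% The isometric isomorphism of Theorem 2 is the map m -> (v, \Sigma):
m \;\longmapsto\; (v,\Sigma) \in
L^2_{\mathbb{F}_T}([0,T],\mathbb{R}^n)\times
L^2_{\mathbb{F}_T}([0,T],\mathcal{L}(\mathbb{R}^m,\mathbb{R}^n)).
```

A continuous linear functional on $\mathcal{SM}^2_0[0,T]$ is therefore represented by a pair of intensities, which is the form in which the Riesz representation theorem is applied in the derivation of the necessary conditions.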

For the derivation of the stochastic minimum principle of optimality we shall require stronger
regularity conditions for the drift and diffusion coefficients $\{f, \sigma\}$, as well as for the running
and terminal pay-off functions $\{\ell, \varphi\}$. These are given below.

Assumptions 3. $E|x(0)|^2_{\mathbb{R}^n} < \infty$ and the maps $\{f,\sigma,\ell,\varphi\}$ satisfy the following conditions.

(C1) The triple $\{f, \sigma, \ell\}$ are measurable in $t \in [0,T]$;

(C2) The quadruple $\{f,\sigma,\ell,\varphi\}$ are once continuously differentiable with respect to the state
variable $x \in \mathbb{R}^n$;

(C3) The first derivatives of $\{f, \sigma\}$ with respect to the state are bounded uniformly on
$[0,T] \times \mathbb{R}^n \times \mathbb{A}^{(N)}$.

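Consider the Gateaux derivative of $\sigma$ with respect to the state variable at the point $(t,z,\nu) \in [0,T] \times
\mathbb{R}^n \times \times_{i=1}^N \mathcal{M}_1(\mathbb{A}^i)$ in the direction $\eta \in \mathbb{R}^n$ defined by

$$\sigma_x(t,z,\nu;\eta) \triangleq \lim_{\varepsilon \to 0} \frac{1}{\varepsilon}\Big\{\sigma(t,z+\varepsilon\eta,\nu) - \sigma(t,z,\nu)\Big\}, \qquad t \in [0,T].$$

Note that the map $\eta \longrightarrow \sigma_x(t,z,\nu;\eta)$ is linear, and it follows from Assumptions 3, (C3) that
there exists a finite positive number $\beta > 0$ such that

$$|\sigma_x(t,z,\nu;\eta)|_{\mathcal{L}(\mathbb{R}^m,\mathbb{R}^n)} \le \beta|\eta|_{\mathbb{R}^n}, \qquad t \in [0,T].$$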

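In order to present the necessary conditions of optimality we need the so called variational
equation. Let us first introduce the variational equation for nonanticipative information structures.
Suppose $u^o \triangleq (u^{1,o}, \ldots, u^{N,o}) \in \mathbb{U}^{(N)}_{rel}[0,T]$ denotes the optimal decision and $u \triangleq (u^1, \ldots, u^N)
\in \mathbb{U}^{(N)}_{rel}[0,T]$ any other decision. Since $\mathbb{U}^i_{rel}[0,T]$ is convex $\forall i \in \mathbb{Z}_N$, it is clear
that for any $\varepsilon \in [0,1]$,

$$u^{i,\varepsilon}_t \triangleq u^{i,o}_t + \varepsilon(u^i_t - u^{i,o}_t) \in \mathbb{U}^i_{rel}[0,T], \qquad \forall i \in \mathbb{Z}_N.$$

Let $x^\varepsilon(\cdot) \equiv x^\varepsilon(\cdot;u^\varepsilon(\cdot))$ and $x^o(\cdot) \equiv x^o(\cdot;u^o(\cdot)) \in B^\infty_{\mathbb{F}_T}([0,T],L^2(\Omega,\mathbb{R}^n))$ denote the solutions of
the system equation (12) corresponding to $u^\varepsilon(\cdot)$ and $u^o(\cdot)$, respectively. Consider the limit

$$Z(t) \triangleq \lim_{\varepsilon \downarrow 0} \frac{1}{\varepsilon}\Big\{x^\varepsilon(t) - x^o(t)\Big\}, \qquad t \in [0,T].$$

We have the following result characterizing the process $\{Z(t) : t \in [0,T]\}$.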

Lemma 2. Suppose Assumptions 3 hold and consider nonanticipative strategies $\mathbb{U}^{(N)}_{rel}[0,T]$. The
process $\{Z(t) : t \in [0,T]\}$ as defined above is an element of the Banach space
$B^\infty_{\mathbb{F}_T}([0,T],L^2(\Omega,\mathbb{R}^n))$ and it is the unique solution of the variational stochastic differential
equation

$$\begin{aligned}
dZ(t) ={}& f_x(t,x^o(t),u^o_t)Z(t)\,dt + \sigma_x(t,x^o(t),u^o_t;Z(t))\,dW(t) \\
&+ \sum_{i=1}^N f(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\,dt + \sum_{i=1}^N \sigma(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\,dW(t), \qquad Z(0) = 0,
\end{aligned} \tag{23}$$

having a continuous modification.

Proof: We closely follow the steps in [33]. Writing the system (12) as an integral equation
with solutions $x^\varepsilon, x^o$ corresponding to controls $u^\varepsilon, u^o$, respectively, taking the difference
$x^\varepsilon(t) - x^o(t)$, dividing by $\varepsilon$ and then letting $\varepsilon \longrightarrow 0$, it can be shown that it converges for all
$t \in [0,T]$, $P$-a.s. to the solution of system (23). Note that the system (23) is a linear stochastic
differential equation in $Z$ with non homogeneous terms given by the sum of the last two terms.
Let $\{z(t) : t \in [0,T]\}$ denote the solution of its homogeneous part given by

$$dz(t) = f_x(t,x^o(t),u^o_t)z(t)\,dt + \sigma_x(t,x^o(t),u^o_t;z(t))\,dW(t), \qquad z(s) = \zeta, \quad t \in [s,T]. \tag{24}$$

By Assumptions 3 and Lemma 1 this system has a unique solution $\{z(t) : t \in [s,T]\}$ given by

$$z(t) = \Psi(t,s)\zeta, \qquad t \in [s,T],$$

where $\Psi(t,s)$, $t \in [s,T]$, is the random ($\mathbb{F}_T$-adapted) transition operator for the homogeneous
system. Since the derivatives of $f$ and $\sigma$ with respect to the state are uniformly bounded, the
transition operator $\Psi(t,s)$, $t \in [s,T]$, is uniformly $P$-a.s. bounded (with values in the space of
$n \times n$ matrices).

Using the random transition operator $\Psi$ we can write the solution of the non homogeneous
stochastic differential equation (23) as follows,

$$Z(t) = \int_0^t \Psi(t,s)\,d\eta(s), \qquad t \in [0,T], \tag{25}$$

where $\{\eta(t) : t \in [0,T]\}$ is the semi martingale given by the following Ito differential,

$$d\eta(t) = \sum_{i=1}^N f(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\,dt + \sum_{i=1}^N \sigma(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\,dW(t), \qquad \eta(0) = 0, \quad t \in (0,T]. \tag{26}$$

Note that $\{\eta(t) : t \in [0,T]\}$ is a continuous square integrable $\mathbb{F}_T$-adapted semi martingale.
The fact that $Z$ has a continuous modification follows directly from the representation (25) and
the continuity of the semi martingale $\{\eta(t) : t \in [0,T]\}$. ■
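
As a numerical sanity check of Lemma 2, the following sketch compares the difference quotient $(x^\varepsilon - x^o)/\varepsilon$ with an Euler-Maruyama discretization of the variational equation (23) along one Brownian path. The scalar coefficients $f(x,u) = \sin x + u$ and $\sigma(x) = \tfrac12\cos x$, the single decision maker, the use of regular (non-relaxed) controls, the reference control standing in for $u^o$, and all function names are assumptions chosen for illustration; none of them is taken from the paper.

```python
import math
import random


def brownian_increments(n_steps, dt, seed=0):
    """Draw one fixed sequence of Brownian increments dW_k ~ N(0, dt)."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps)]


def state_path(u, dW, dt, x0=0.5):
    """Euler-Maruyama path of dx = (sin x + u) dt + 0.5 cos x dW (assumed model)."""
    x = [x0]
    for u_k, dw in zip(u, dW):
        x_k = x[-1]
        x.append(x_k + (math.sin(x_k) + u_k) * dt + 0.5 * math.cos(x_k) * dw)
    return x


def variation_path(x_ref, u_ref, u_alt, dW, dt):
    """Euler-Maruyama path of the variational equation with Z(0) = 0.

    Here f_x = cos x and sigma_x = -0.5 sin x. The forcing term is
    u - u_ref because the assumed drift is affine in the control, and
    sigma does not depend on the control, so it adds no forcing term.
    """
    z = [0.0]
    for k, dw in enumerate(dW):
        f_x = math.cos(x_ref[k])
        s_x = -0.5 * math.sin(x_ref[k])
        z_k = z[-1]
        z.append(z_k + (f_x * z_k + u_alt[k] - u_ref[k]) * dt + s_x * z_k * dw)
    return z


def max_gap(eps, n_steps=200, horizon=1.0):
    """Largest |(x^eps - x^ref)/eps - Z| over the time grid, on one sample path."""
    dt = horizon / n_steps
    dW = brownian_increments(n_steps, dt)
    u_ref = [0.0] * n_steps                             # plays the role of u^o
    u_alt = [math.cos(k * dt) for k in range(n_steps)]  # any other control u
    u_eps = [ur + eps * (ua - ur) for ur, ua in zip(u_ref, u_alt)]
    x_ref = state_path(u_ref, dW, dt)
    x_eps = state_path(u_eps, dW, dt)
    z = variation_path(x_ref, u_ref, u_alt, dW, dt)
    return max(abs((xe - xr) / eps - zk) for xe, xr, zk in zip(x_eps, x_ref, z))


if __name__ == "__main__":
    for eps in (1e-1, 1e-2, 1e-3):
        print(eps, max_gap(eps))
```

Under these assumptions the gap should shrink roughly in proportion to $\varepsilon$, which is the discrete counterpart of the limit defining $Z(t)$. The same Brownian increments are reused for both state paths, since the limit is taken path by path.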
Clearly, the variational equation for nonanticipative strategies $\mathbb{U}^{(N)}_{rel}[0,T]$ is obtained as in
centralized control strategies found in [36]. Next, we discuss the variational equation for feedback
information structures. For $u \in \mathbb{U}^{(N),z^u}_{rel}[0,T]$ the variational equation will also involve derivatives
of $u$ with respect to the state trajectory $x$, since such strategies utilize feedback. To avoid this
technicality, we first address the question as to whether optimizing $J(u)$ over nonanticipative
information structures is the same as optimizing $J(u)$ over feedback information structures. If
this is the case then the variational equation for $u \in \mathbb{U}^{(N),z^u}_{rel}[0,T]$ will be that of $u \in \mathbb{U}^{(N)}_{rel}[0,T]$.
We shall require the following assumption.

Assumptions 4. The following holds.

(E1) The diffusion coefficient $\sigma$ is independent of $u$ and both $\sigma(\cdot,\cdot)$ and $\sigma^{-1}(\cdot,\cdot)$ are uniformly bounded.

Under the (additional) Assumptions 4 we can prove the following theorem.

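Theorem 3. Consider Problem 1 and suppose Assumptions 1 and 4 hold. Define the $\sigma$-algebras

$$\mathcal{F}^{x(0),W}_{0,t} \triangleq \sigma\{x(0), W(s) : 0 \le s \le t\}, \qquad \mathcal{F}^{x^u}_{0,t} \triangleq \sigma\{x^u(s) : 0 \le s \le t\}, \qquad \forall t \in [0,T].$$

Then for all $u \in \mathbb{U}^{(N),z^u}_{rel}[0,T]$ the two $\sigma$-algebras are equivalent, written as an equality, $\mathcal{F}^{x(0),W}_{0,t} = \mathcal{F}^{x^u}_{0,t}$, $\forall t \in [0,T]$.

Proof: Clearly, by Lemma 1 we have $\mathcal{F}^{x^u}_{0,t} \subseteq \mathcal{F}^{x(0),W}_{0,t}$, $\forall u \in \mathbb{U}^{(N)}_{rel}[0,T]$, $t \in [0,T]$. By use
of Assumptions 4 one can easily verify that $\mathcal{F}^{x(0),W}_{0,t} \subseteq \mathcal{F}^{x^u}_{0,t}$, $\forall t \in [0,T]$. This completes the
proof. ■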
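Under the conditions of Theorem 3, for any stochastic kernel $\{u^i_t(\Gamma) = q^i_t(\Gamma \mid \mathcal{G}^{z^{i,u}}_{0,t}) : t \in
[0,T]\} \in \mathbb{U}^{z^{i,u}}_{rel}[0,T]$, $\Gamma \in \mathcal{B}(\mathbb{A}^i)$, which is $\mathcal{G}^{z^{i,u}}_{0,t}$-measurable, there exists a function $\phi^i(\cdot)$ adapted
to a sub-$\sigma$-algebra $\mathcal{F}^i_{0,t} \subset \mathcal{F}^{x(0),W}_{0,t}$ such that $u^i_t(\Gamma) = q^i_t(\Gamma \mid \phi^i(t,x(0),W(\cdot \wedge t)))$, $P$-a.s.,
$\forall t \in [0,T]$, $i = 1, \ldots, N$.

Let $\mathcal{F}^i_T \triangleq \{\mathcal{F}^i_{0,t} : t \in [0,T]\}$, $\mathcal{G}^{z^{i,u}}_T \triangleq \{\mathcal{G}^{z^{i,u}}_{0,t} : t \in [0,T]\}$, $i = 1, \ldots, N$, and define all such
adapted nonanticipative functions by

$$\overline{\mathbb{U}}^i_{rel}[0,T] \triangleq \Big\{ u^i \in L^\infty_{\mathcal{F}^i_T}([0,T],\mathcal{M}_1(\mathbb{A}^i)) : u^i \in L^\infty_{\mathcal{G}^{z^{i,u}}_T}([0,T],\mathcal{M}_1(\mathbb{A}^i)) \Big\}, \qquad \forall i \in \mathbb{Z}_N. \tag{27}$$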

Next, we introduce the following additional assumptions. 

Assumptions 5. The following holds.

(E2) $\mathbb{U}^{z^{i,u}}_{rel}[0,T]$ is dense in the class $\overline{\mathbb{U}}^i_{rel}[0,T]$ defined by (27), $\forall i \in \mathbb{Z}_N$.

Under the additional Assumptions 5 we can prove the following result.

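Theorem 4. Consider Problem 1 with control strategies from $\overline{\mathbb{U}}^{(N)}_{rel}[0,T] \triangleq \times_{i=1}^N \overline{\mathbb{U}}^i_{rel}[0,T]$, the product of the classes defined by (27). Under Assumptions 1,
2, 5 and $|\varphi_x(x)|_{\mathbb{R}^n} + |\ell_x(t,x,u)|_{\mathbb{R}^n} \le K(1 + |x|_{\mathbb{R}^n})$ we have

$$\inf_{u \in \overline{\mathbb{U}}^{(N)}_{rel}[0,T]} J(u) = \inf_{u \in \mathbb{U}^{(N),z^u}_{rel}[0,T]} J(u). \tag{28}$$

Proof: The assertion follows from the density assumption (E2) and the continuity
of $J$ in the vague topology.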



The point to be made regarding Theorem 4 is that if $u^o \in \mathbb{U}^{(N),z^u}_{rel}[0,T]$ achieves the infimum
of $J(u)$ then it is also optimal with respect to $\overline{\mathbb{U}}^{(N)}_{rel}[0,T] \triangleq \times_{i=1}^N \overline{\mathbb{U}}^i_{rel}[0,T]$. Consequently, the
necessary conditions for feedback information structures $u \in \mathbb{U}^{(N),z^u}_{rel}[0,T]$ to be optimal are
those for which nonanticipative information structures $u \in \overline{\mathbb{U}}^{(N)}_{rel}[0,T]$ are optimal.
In the next remark we give an example for which Assumptions 5 hold, and hence Theorem 4 is
valid.

Remark 1. Suppose $x^1$ and $x^2$ are governed by the following stochastic differential equations

$$dx^1(t) = f^1(t,x^1(t),u^1(t))\,dt + \sigma^1(t,x^1(t))\,dW^1(t), \qquad x^1(0) = x^1_0, \tag{29}$$
$$dx^2(t) = f^2(t,x^1(t),x^2(t),u^1(t),u^2(t))\,dt + \sigma^2(t,x^1(t),x^2(t))\,dW^2(t), \qquad x^2(0) = x^2_0, \tag{30}$$
$$z^1(t) = h^1(t,x^1(t)), \qquad z^2(t) = h^2(t,x^1(t),x^2(t)), \qquad t \in [0,T], \tag{31}$$

where $h^1, h^2$ are measurable, $W^1(\cdot), W^2(\cdot)$ are independent, and $u^1 \in \mathbb{U}^{z^{1,u}}_{rel}[0,T]$, $u^2 \in
\mathbb{U}^{z^{2,u}}_{rel}[0,T]$. If we further assume that $\{\sigma^1(\cdot,\cdot), \sigma^2(\cdot,\cdot)\}$ and their inverses are bounded, then we can
find $\overline{\mathbb{U}}^i_{rel}[0,T]$, $i = 1,2$, for which (E2) holds, and thus Theorem 4 holds. The structure of the
stochastic dynamics (29), (30) can be generalized to more than two coupled systems.



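Next, we introduce the following alternative theorem to Theorem 4, which does not employ
Assumptions 5.

Theorem 5. Consider Problem 1 with strategies from $\mathbb{U}^{(N),z^u}_{rel}[0,T]$, under Assumptions 1, 2, 3,
4 and $|\varphi_x(x)|_{\mathbb{R}^n} + |\ell_x(t,x,u)|_{\mathbb{R}^n} \le K(1 + |x|_{\mathbb{R}^n})$.
Then $\mathbb{U}^{z^{i,u}}_{rel}[0,T]$ is dense in $\overline{\mathbb{U}}^i_{rel}[0,T]$, $\forall i \in \mathbb{Z}_N$, and

$$\inf_{u \in \overline{\mathbb{U}}^{(N)}_{rel}[0,T]} J(u) = \inf_{u \in \mathbb{U}^{(N),z^u}_{rel}[0,T]} J(u). \tag{32}$$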

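Proof: The derivation is based on [40] but extended to relaxed strategies. By Theorem 3,
for any $u^i \in \mathbb{U}^{z^{i,u}}_{rel}[0,T]$ which is $\mathcal{G}^{z^{i,u}}_T$-adapted we can define the set $\overline{\mathbb{U}}^i_{rel}[0,T]$, $i = 1, \ldots, N$,
via (27). For any $u \in \overline{\mathbb{U}}^{(N)}_{rel}[0,T] \triangleq \times_{i=1}^N \overline{\mathbb{U}}^i_{rel}[0,T]$, $k = T/M$, and any test function $\varphi \in C(\mathbb{A}^{(N)})$,
define

$$u_{k,t}(\varphi) \triangleq \begin{cases}
\displaystyle \int_{\mathbb{A}^{(N)}} \varphi(\xi)\,u_0(d\xi) & \text{for } 0 \le t < k, \quad u_0 \in \mathbb{A}^{(N)}, \\[2ex]
\displaystyle \frac{1}{k}\int_{(n-1)k}^{nk} \int_{\mathbb{A}^{(N)}} \varphi(\xi)\,u_s(d\xi)\,ds & \text{for } nk \le t < (n+1)k, \quad n = 1, \ldots, M-1.
\end{cases} \tag{33}$$

Clearly $u_k \in \overline{\mathbb{U}}^{(N)}_{rel}[0,T]$, and $u_k \longrightarrow u$ in $L^\infty_{\mathbb{F}_T}([0,T],\mathcal{M}_1(\mathbb{A}^{(N)}))$ in the weak star sense. We
need to show that $u_k \in \mathbb{U}^{(N),z^u}_{rel}[0,T]$. Let $x_k$ denote the trajectory corresponding to $u_k$, and
$\mathcal{F}^{x_k}_{0,t}$ the $\sigma$-algebra generated by $\{x_k(s) : 0 \le s \le t\}$. Define

$$I_k(t) \triangleq \int_0^t \sigma(s,x_k(s))\,dW(s) = x_k(t) - x(0) - \int_0^t f(s,x_k(s),u_k(s))\,ds, \tag{34}$$

and

$$W(t) = \int_0^t \sigma(s,x_k(s))^{-1}\,dI_k(s). \tag{35}$$

Since $u_k \in \overline{\mathbb{U}}^{(N)}_{rel}[0,T]$, the process $I_k(t)$ is $\mathcal{F}^{x_k}_{0,t}$-measurable, for $0 \le t \le k$. Hence,

$$\mathcal{F}^{x(0),W}_{0,t} = \mathcal{F}^{x_k}_{0,t}, \qquad 0 \le t \le k. \tag{36}$$

Therefore, $u_{k,t}$ is $\mathcal{F}^{x_k}_{0,t}$-measurable for $k \le t \le 2k$. From the above equations it follows that
(36) also holds for $k \le t \le 2k$, and by induction that $\mathcal{F}^{x(0),W}_{0,t} = \mathcal{F}^{x_k}_{0,t}$, $t \in [0,T]$. Therefore,
$u_{k,t}$ is also (weak star) measurable with respect to $\mathcal{F}^{x_k}_{0,t}$. Hence, for any $u^i$ which is (weak star)
measurable with respect to a nonanticipative functional $z^i = h^i(t,x)$ there exists a nonanticipative
functional of $\{x(0), W\}$ which realizes it. By Theorem 4 the derivation is complete. ■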

Before we prove the optimality conditions we define the Hamiltonian system of equations.
The Hamiltonian is a real valued function

$$\mathbb{H} : [0,T] \times \mathbb{R}^n \times \mathbb{R}^n \times \mathcal{L}(\mathbb{R}^m,\mathbb{R}^n) \times \mathcal{M}_1(\mathbb{A}^{(N)}) \longrightarrow \mathbb{R}$$

given by

$$\mathbb{H}(t,\xi,\zeta,M,\nu) \triangleq \langle f(t,\xi,\nu),\zeta\rangle + tr\big(M^*\sigma(t,\xi,\nu)\big) + \ell(t,\xi,\nu), \qquad t \in [0,T]. \tag{37}$$

For any $u \in \mathbb{U}^{(N)}_{rel}[0,T]$, the adjoint process $(\psi,Q) \in L^2_{\mathbb{F}_T}([0,T],\mathbb{R}^n) \times L^2_{\mathbb{F}_T}([0,T],\mathcal{L}(\mathbb{R}^m,\mathbb{R}^n))$
satisfies the following backward stochastic differential equation

$$\begin{aligned}
d\psi(t) &= -f_x^*(t,x(t),u_t)\psi(t)\,dt - V_Q(t)\,dt - \ell_x(t,x(t),u_t)\,dt + Q(t)\,dW(t), \qquad t \in [0,T), \\
&= -\mathbb{H}_x(t,x(t),\psi(t),Q(t),u_t)\,dt + Q(t)\,dW(t),
\end{aligned} \tag{38}$$

$$\psi(T) = \varphi_x(x(T)), \tag{39}$$

where $V_Q \in L^2_{\mathbb{F}_T}([0,T],\mathbb{R}^n)$ is given by $\langle V_Q(t),\zeta\rangle = tr\big(Q^*(t)\sigma_x(t,x(t),u_t;\zeta)\big)$, $t \in [0,T]$ (e.g.,
$V_Q(t) = \sum_{k=1}^m \big(\sigma_x^{(k)}(t,x(t),u_t)\big)^* Q^{(k)}(t)$, $t \in [0,T]$, where $\sigma^{(k)}$ is the $k$th column of $\sigma$, $\sigma_x^{(k)}$ is the
derivative of $\sigma^{(k)}$ with respect to the state, for $k = 1,2,\ldots,m$, and $Q^{(k)}$ is the $k$th column of $Q$).
In terms of the Hamiltonian, the state process satisfies the stochastic differential equation

$$\begin{aligned}
dx(t) &= f(t,x(t),u_t)\,dt + \sigma(t,x(t),u_t)\,dW(t), \qquad t \in (0,T], \\
&= \mathbb{H}_\psi(t,x(t),\psi(t),Q(t),u_t)\,dt + \sigma(t,x(t),u_t)\,dW(t),
\end{aligned} \tag{40}$$

$$x(0) = x_0. \tag{41}$$

A. Necessary Conditions of Optimality

In this section we state and prove the necessary conditions for team optimality. Specifically,
given that $u^o \in \mathbb{U}^{(N)}_{rel}[0,T]$ or $u^o \in \mathbb{U}^{(N),z^u}_{rel}[0,T]$ is team optimal, we show that it leads naturally
to the Hamiltonian system of equations (called necessary conditions). The derivation is based
on the semi martingale representation as in [36], with some modifications necessary to admit
decentralized strategies adapted to an arbitrary filtration.

In the following theorem we present the necessary conditions of optimality for Problem 1.

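Theorem 6. (Necessary conditions for team optimality) Consider Problem 1 under Assumptions 2, 3.

(I) Suppose $\mathbb{F}_T \triangleq \sigma\{x(0), W(t), t \in [0,T]\}$ and $\mathbb{U}^{(N)}_{rel}[0,T]$ is the class of relaxed controls
adapted to this filtration. For an element $u^o \in \mathbb{U}^{(N)}_{rel}[0,T]$ with the corresponding
solution $x^o \in B^\infty_{\mathbb{F}_T}([0,T],L^2(\Omega,\mathbb{R}^n))$ to be team optimal, it is necessary that the
following conditions hold.

(1) There exists a semi martingale $m^o \in \mathcal{SM}^2_0[0,T]$ with the intensity process
$(\psi^o,Q^o) \in L^2_{\mathbb{F}_T}([0,T],\mathbb{R}^n) \times L^2_{\mathbb{F}_T}([0,T],\mathcal{L}(\mathbb{R}^m,\mathbb{R}^n))$.

(2) The processes $\{u^o, x^o, \psi^o, Q^o\}$ satisfy the inequality:

$$\sum_{i=1}^N E\Big\{\int_0^T \mathbb{H}(t,x^o(t),\psi^o(t),Q^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\,dt\Big\} \ge 0, \qquad \forall u \in \mathbb{U}^{(N)}_{rel}[0,T]. \tag{42}$$

(3) The process $(\psi^o,Q^o)$ is the unique solution of the backward stochastic differential
equation (38), (39) and, for $\mathcal{G}^i_{0,t} \subset \mathbb{F}_{0,t}$, the control $u^o \in \mathbb{U}^{(N)}_{rel}[0,T]$
satisfies the point wise almost sure inequalities

$$E\Big\{\mathbb{H}(t,x^o(t),\psi^o(t),Q^o(t),u^{-i,o}_t,\nu^i) \,\Big|\, \mathcal{G}^i_{0,t}\Big\} \ge E\Big\{\mathbb{H}(t,x^o(t),\psi^o(t),Q^o(t),u^o_t) \,\Big|\, \mathcal{G}^i_{0,t}\Big\},$$
$$\forall \nu^i \in \mathcal{M}_1(\mathbb{A}^i), \quad a.e.\ t \in [0,T], \quad P|_{\mathcal{G}^i_{0,t}}\text{-a.s.}, \quad i = 1,2,\ldots,N. \tag{43}$$

(II) Suppose $\mathbb{F}_T$ is as above, and Assumptions 5 hold. For an element $u^o \in \mathbb{U}^{(N),z^u}_{rel}[0,T]$
with the corresponding solution $x^o \in B^\infty_{\mathbb{F}_T}([0,T],L^2(\Omega,\mathbb{R}^n))$ to be team optimal, it is
necessary that the statements of Part (I) hold with $\mathcal{G}^i_{0,t}$ replaced by $\mathcal{G}^{z^{i,u}}_{0,t}$, $\forall t \in [0,T]$.

Proof: The derivation of (1), (2) follows closely the basic steps of centralized strategies in
[36], from which the team necessary conditions of optimality (3) are established.

(I). (1) Suppose $u^o \in \mathbb{U}^{(N)}_{rel}[0,T]$ is an optimal team decision and $u \in \mathbb{U}^{(N)}_{rel}[0,T]$ any other
admissible decision. Since $\mathbb{U}^i_{rel}[0,T]$ is convex $\forall i \in \mathbb{Z}_N$, we have, for any $\varepsilon \in [0,1]$, $u^{i,\varepsilon}_t \triangleq
u^{i,o}_t + \varepsilon(u^i_t - u^{i,o}_t) \in \mathbb{U}^i_{rel}[0,T]$, $\forall i \in \mathbb{Z}_N$. Let $x^\varepsilon(\cdot) \equiv x^\varepsilon(\cdot;u^\varepsilon(\cdot))$, $x^o(\cdot) \equiv x^o(\cdot;u^o(\cdot)) \in
B^\infty_{\mathbb{F}_T}([0,T],L^2(\Omega,\mathbb{R}^n))$ denote the solutions of the system (40), (41) corresponding to $u^\varepsilon(\cdot)$
and $u^o(\cdot)$, respectively. Since $u^o(\cdot) \in \mathbb{U}^{(N)}_{rel}[0,T]$ is optimal it is clear that

$$J(u^\varepsilon) - J(u^o) \ge 0, \qquad \forall \varepsilon \in [0,1], \quad \forall u \in \mathbb{U}^{(N)}_{rel}[0,T]. \tag{44}$$

Define the Gateaux differential of $J$ at $u^o$ in the direction $u - u^o$ by

$$dJ(u^o,u-u^o) \triangleq \lim_{\varepsilon \downarrow 0} \frac{J(u^\varepsilon) - J(u^o)}{\varepsilon} \equiv \frac{d}{d\varepsilon}J(u^\varepsilon)\Big|_{\varepsilon=0}.$$

Dividing the expression (44) by $\varepsilon$ and letting $\varepsilon \longrightarrow 0$ we obtain

$$dJ(u^o,u-u^o) = L(Z) + \sum_{i=1}^N E\int_0^T \ell(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\,dt \ge 0, \qquad \forall u \in \mathbb{U}^{(N)}_{rel}[0,T], \tag{45}$$

where $L(Z)$ is given by the functional

$$L(Z) = E\Big\{\int_0^T \langle \ell_x(t,x^o(t),u^o_t),Z(t)\rangle\,dt + \langle \varphi_x(x^o(T)),Z(T)\rangle\Big\}. \tag{46}$$

Since by Lemma 2 the process $Z(\cdot) \in B^\infty_{\mathbb{F}_T}([0,T],L^2(\Omega,\mathbb{R}^n))$ and it is also continuous $P$-a.s.,
it follows from Assumptions 2, (B2), and Assumptions 3 that $Z \longrightarrow L(Z)$ is a continuous
linear functional. Further, by Lemma 2, $\eta \longrightarrow Z$ is a continuous linear map from the Hilbert
space $\mathcal{SM}^2_0[0,T]$ to the B-space $B^\infty_{\mathbb{F}_T}([0,T],L^2(\Omega,\mathbb{R}^n))$ given by the expression (25). Thus the
composition map $\eta \longrightarrow Z \longrightarrow L(Z) \equiv \tilde{L}(\eta)$ is a continuous linear functional on $\mathcal{SM}^2_0[0,T]$.
Then by virtue of the Riesz representation theorem for Hilbert spaces, there exists a semi martingale
$m^o \in \mathcal{SM}^2_0[0,T]$ with intensity $(\psi^o,Q^o) \in L^2_{\mathbb{F}_T}([0,T],\mathbb{R}^n) \times L^2_{\mathbb{F}_T}([0,T],\mathcal{L}(\mathbb{R}^m,\mathbb{R}^n))$ such that

$$\begin{aligned}
L(Z) \equiv \tilde{L}(\eta) = (m^o,\eta)_{\mathcal{SM}^2_0[0,T]} ={}& \sum_{i=1}^N E\int_0^T \langle \psi^o(t), f(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\rangle\,dt \\
&+ \sum_{i=1}^N E\int_0^T tr\big(Q^{o,*}(t)\sigma(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\big)\,dt.
\end{aligned} \tag{47}$$

This proves (1).

(2) Substituting (47) into (45) we obtain the following variational equation

$$\begin{aligned}
dJ(u^o,u-u^o) ={}& \sum_{i=1}^N E\int_0^T \langle \psi^o(t), f(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\rangle\,dt \\
&+ \sum_{i=1}^N E\int_0^T tr\big(Q^{o,*}(t)\sigma(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\big)\,dt \\
&+ \sum_{i=1}^N E\int_0^T \ell(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\,dt \ge 0, \qquad \forall u \in \mathbb{U}^{(N)}_{rel}[0,T].
\end{aligned} \tag{48}$$

It follows from the definition of the Hamiltonian that the inequality (48) is precisely (42) along
with the pair $\{(\psi^o(t),Q^o(t)) : t \in [0,T]\}$. This completes the proof of (2).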

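(3) Next, we prove that the pair $\{(\psi^o(t),Q^o(t)) : t \in [0,T]\}$ is given by the solution of the
adjoint equations (38), (39). Computing the Ito differential of the scalar product $\langle Z,\psi^o\rangle$ and
integrating this over $[0,T]$, it follows from the variational equation (23) that

$$\begin{aligned}
E\langle Z(T),\psi^o(T)\rangle ={}& E\Big\{\int_0^T \big\langle Z(t), f_x^*(t,x^o(t),u^o_t)\psi^o(t)\,dt + \sigma_x^*(t,x^o(t),u^o_t;\psi^o(t))\,dW(t) + d\psi^o(t)\big\rangle \\
&+ \sum_{i=1}^N \int_0^T \langle f(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t),\psi^o(t)\rangle\,dt \\
&+ \sum_{i=1}^N \int_0^T \langle \sigma^*(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\psi^o(t),dW(t)\rangle + \int_0^T \langle dZ,d\psi^o\rangle(t)\Big\}
\end{aligned} \tag{49}$$

$$\begin{aligned}
={}& E\Big\{\int_0^T \big\langle Z(t), f_x^*(t,x^o(t),u^o_t)\psi^o(t)\,dt + d\psi^o(t)\big\rangle \\
&+ \sum_{i=1}^N \int_0^T \langle f(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t),\psi^o(t)\rangle\,dt + \int_0^T \langle dZ,d\psi^o\rangle(t)\Big\},
\end{aligned} \tag{50}$$

where the last bracket $\langle \cdot \rangle$ in each of the above expressions is the quadratic variation between
the two processes, and the stochastic integrals in (49) have zero expectation, giving (50). Since the Ito
derivatives of the variation process $\{Z(t) : t \in [0,T]\}$ and the adjoint process $\{\psi^o(t) : t \in [0,T]\}$
have the form

$$\begin{aligned}
dZ(t) ={}& \text{bounded variation terms} + \sigma_x(t,x^o(t),u^o_t;Z(t))\,dW(t) \\
&+ \sum_{i=1}^N \sigma(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\,dW(t), \qquad Z(0) = 0, \quad t \in (0,T],
\end{aligned} \tag{51}$$

$$d\psi^o(t) = \text{bounded variation terms} + Q^o(t)\,dW(t), \qquad \psi^o(T) = \varphi_x(x^o(T)), \tag{52}$$

their quadratic variation is given by

$$\begin{aligned}
E\int_0^T \langle dZ,d\psi^o\rangle(t) ={}& E\Big\{\int_0^T tr\big(Q^{o,*}(t)\sigma_x(t,x^o(t),u^o_t;Z(t))\big)\,dt\Big\} \\
&+ \sum_{i=1}^N E\Big\{\int_0^T tr\big(Q^{o,*}(t)\sigma(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\big)\,dt\Big\}.
\end{aligned} \tag{53}$$

The first term on the right hand side of the above expression is linear in $Z$, hence there exists
a process $\{V_{Q^o}(t) : t \in [0,T]\}$, given by the following expression

$$\langle V_{Q^o}(t),Z(t)\rangle = tr\big(Q^{o,*}(t)\sigma_x(t,x^o(t),u^o_t;Z(t))\big). \tag{54}$$

By Assumptions 3, $\sigma$ has uniformly bounded spatial first derivative and it follows from the semi
martingale representation that $Q^o \in L^2_{\mathbb{F}_T}([0,T],\mathcal{L}(\mathbb{R}^m,\mathbb{R}^n))$ and hence $V_{Q^o} \in L^2_{\mathbb{F}_T}([0,T],\mathbb{R}^n)$.
Substituting (54) into (53) and (53) into (50), we obtain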

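$$\begin{aligned}
E\langle Z(T),\psi^o(T)\rangle ={}& E\Big\{\int_0^T \big\langle Z(t), f_x^*(t,x^o(t),u^o_t)\psi^o(t)\,dt + V_{Q^o}(t)\,dt - Q^o(t)\,dW(t) + d\psi^o(t)\big\rangle \\
&+ \sum_{i=1}^N \int_0^T \langle f(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t),\psi^o(t)\rangle\,dt \\
&+ \sum_{i=1}^N \int_0^T tr\big(Q^{o,*}(t)\sigma(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\big)\,dt\Big\}.
\end{aligned} \tag{55}$$

Thus, by setting

$$d\psi^o(t) = -f_x^*(t,x^o(t),u^o_t)\psi^o(t)\,dt - V_{Q^o}(t)\,dt + Q^o(t)\,dW(t) - \ell_x(t,x^o(t),u^o_t)\,dt, \qquad t \in [0,T), \tag{56}$$

$$\psi^o(T) = \varphi_x(x^o(T)), \tag{57}$$

it follows from (55) and the expression for the functional $L(\cdot)$ given by (46) that

$$\begin{aligned}
L(Z) &= E\Big\{\langle Z(T),\psi^o(T)\rangle + \int_0^T \langle Z(t),\ell_x(t,x^o(t),u^o_t)\rangle\,dt\Big\} \\
&= \sum_{i=1}^N E\Big\{\int_0^T \Big(\langle f(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t),\psi^o(t)\rangle + tr\big(Q^{o,*}(t)\sigma(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\big)\Big)\,dt\Big\}.
\end{aligned} \tag{58}$$

Substituting (58) into (45) we again obtain (42), as expected. This is precisely what was obtained
by the semi martingale argument giving (47). Thus the pair $\{(\psi^o(t),Q^o(t)) : t \in [0,T]\}$ must
satisfy the backward stochastic differential equation (56), (57), which is precisely the adjoint
equation given by (38), (39). Since $\psi^o$ satisfies the stochastic differential equation and $T$ is
finite, it follows from the classical theory of Ito differential equations that $\psi^o$ is actually an
element of $B^\infty_{\mathbb{F}_T}([0,T],L^2(\Omega,\mathbb{R}^n)) \subset L^2_{\mathbb{F}_T}([0,T],\mathbb{R}^n)$. In other words, $\psi^o$ is more regular than
predicted by semi martingale theory. Hence, by our Assumptions on $\sigma$ it is easy to verify that
$\sigma_x^*(t,x^o(t),u^o_t;\psi^o(t)) \in L^2_{\mathbb{F}_T}([0,T],\mathcal{L}(\mathbb{R}^m,\mathbb{R}^n))$ and

$$\sigma^*(t,x^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\psi^o(t) \in L^2_{\mathbb{F}_T}([0,T],\mathbb{R}^m), \qquad i = 1, \ldots, N.$$

This proves the first part of (3).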

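Now we show (43). Write (42) in terms of the Hamiltonian as follows.

$$\sum_{i=1}^N E\Big\{\int_0^T \mathbb{H}(t,x^o(t),\psi^o(t),Q^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\,dt\Big\} \ge 0, \qquad \forall u \in \mathbb{U}^{(N)}_{rel}[0,T], \tag{59}$$

where the triple $\{x^o,\psi^o,Q^o\}$ is the unique solution of the Hamiltonian system (38), (39), (40),
(41). By using the property of conditional expectation, (59) can be written as

$$\sum_{i=1}^N E\Big\{\int_0^T E\Big\{\mathbb{H}(t,x^o(t),\psi^o(t),Q^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t) \,\Big|\, \mathcal{G}^i_{0,t}\Big\}\,dt\Big\} \ge 0, \qquad \forall u \in \mathbb{U}^{(N)}_{rel}[0,T]. \tag{60}$$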

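Let $t \in (0,T)$, $\omega \in \Omega$ and $\varepsilon > 0$, and consider the sets $I^i_\varepsilon \equiv [t,t+\varepsilon] \subset [0,T]$ and $\Omega^i_\varepsilon\ (\subset \Omega) \in \mathcal{G}^i_{0,t}$
containing $\omega$ such that $|I^i_\varepsilon| \longrightarrow 0$ and $P(\Omega^i_\varepsilon) \longrightarrow 0$ as $\varepsilon \longrightarrow 0$, for $i = 1,2,\ldots,N$. For any sub-sigma
algebra $\mathcal{G} \subset \mathbb{F}$, let $P|_{\mathcal{G}}$ denote the restriction of the probability measure $P$ to the $\sigma$-algebra
$\mathcal{G}$. For any (vaguely) $\mathcal{G}^i_{0,t}$-adapted $\nu^i \in \mathcal{M}_1(\mathbb{A}^i)$, construct

$$u^i_t = \begin{cases} \nu^i & \text{for } (t,\omega) \in I^i_\varepsilon \times \Omega^i_\varepsilon, \\ u^{i,o}_t & \text{otherwise,} \end{cases} \qquad i = 1,2,\ldots,N. \tag{61}$$

Clearly, it follows from the above construction that $u^i \in \mathbb{U}^i_{rel}[0,T]$. Substituting (61) in (60) we
obtain the following inequality

$$\sum_{i=1}^N \int_{I^i_\varepsilon \times \Omega^i_\varepsilon} E\Big\{\mathbb{H}(s,x^o(s),\psi^o(s),Q^o(s),u^{-i,o}_s,\nu^i - u^{i,o}_s) \,\Big|\, \mathcal{G}^i_{0,s}\Big\}\,ds\,dP \ge 0,$$
$$\forall \nu^i \in \mathcal{M}_1(\mathbb{A}^i), \quad a.e.\ t \in [0,T], \quad P|_{\mathcal{G}^i_{0,t}}\text{-a.s.}, \quad i = 1,2,\ldots,N. \tag{62}$$

Letting $|I^i_\varepsilon|$ denote the Lebesgue measure of the set $I^i_\varepsilon$, dividing the above expression by the
product measure $P(\Omega^i_\varepsilon)|I^i_\varepsilon|$ and letting $\varepsilon \longrightarrow 0$, we arrive at the following inequality.

$$\sum_{i=1}^N E\Big\{\mathbb{H}(t,x^o(t),\psi^o(t),Q^o(t),u^{-i,o}_t,\nu^i) \,\Big|\, \mathcal{G}^i_{0,t}\Big\} \ge \sum_{i=1}^N E\Big\{\mathbb{H}(t,x^o(t),\psi^o(t),Q^o(t),u^o_t) \,\Big|\, \mathcal{G}^i_{0,t}\Big\},$$
$$\forall \nu^i \in \mathcal{M}_1(\mathbb{A}^i), \quad a.e.\ t \in [0,T], \quad P|_{\mathcal{G}^i_{0,t}}\text{-a.s.}, \quad i = 1,2,\ldots,N. \tag{63}$$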
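To complete the proof of (3) define

$$g^i(t,\omega) \triangleq E\Big\{\mathbb{H}(t,x^o(t),\psi^o(t),Q^o(t),u^{-i,o}_t,\nu^i - u^{i,o}_t) \,\Big|\, \mathcal{G}^i_{0,t}\Big\}, \qquad t \in [0,T], \quad \forall i \in \mathbb{Z}_N. \tag{64}$$

We shall show that

$$g^i(t,\omega) \ge 0, \qquad \forall \nu^i \in \mathcal{M}_1(\mathbb{A}^i), \quad a.e.\ t \in [0,T], \quad P|_{\mathcal{G}^i_{0,t}}\text{-a.s.}, \quad \forall i \in \mathbb{Z}_N. \tag{65}$$

Suppose for some $i \in \mathbb{Z}_N$, (65) does not hold, and let $A^i \triangleq \{(t,\omega) : g^i(t,\omega) < 0\}$. Since $g^i(t)$
is $\mathcal{G}^i_{0,t}$-measurable $\forall t \in [0,T]$ we can choose $u^i$ in (63) as

$$u^i_t \triangleq \begin{cases} \nu^i & \text{on } A^i, \\ u^{i,o}_t & \text{outside } A^i, \end{cases}$$

together with $u^j_t = u^{j,o}_t$, $j \ne i$, $j \in \mathbb{Z}_N$. Substituting this in (63) we arrive at $\int_{A^i} g^i(t,\omega)\,dt\,dP \ge 0$,
which contradicts the definition of $A^i$, unless $A^i$ has measure zero. Hence, (65) holds, which is
precisely (43). This completes Part (I).

(II). By Theorem 4 the necessary conditions for team optimality satisfy those in Part (I) with
$\mathcal{G}^i_{0,t}$ replaced by $\mathcal{G}^{z^{i,u}}_{0,t}$.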



The following remark helps identify the martingale term in the adjoint process.

Remark 2. The arguments in the derivation of Theorem 6 involving the Riesz representation
theorem for Hilbert space martingales determine the martingale term of the adjoint process,
$M_t = \int_0^t \psi^o_x(s)\sigma(s,x^o(s),u^o_s)\,dW(s)$, dual to the first martingale term in the variational equation
(23), provided $\psi_x(\cdot)$ exists (i.e., $f_{xx}, \sigma_{xx}, \ell_{xx}, \varphi_{xx}$ exist and are uniformly bounded). Hence, $Q$
in the adjoint equation is identified as $Q(t) \equiv \psi_x(t)\sigma(t,x(t),u_t)$. When the diffusion term
$\sigma(\cdot,\cdot,\cdot)$ is independent of $x$, given by $\sigma(t,u)$, then since $\langle V_Q(t),\zeta\rangle = tr\big(Q^*(t)\sigma_x(t,x,u_t;\zeta)\big)$ we
have $V_Q(t) = 0$, $\forall t \in [0,T]$ (i.e., the spatial derivative of the diffusion term is zero).

It is interesting to note that the necessary conditions for $u^o \in \mathbb{U}^{(N)}_{rel}[0,T]$ or $u^o \in \mathbb{U}^{(N),z^u}_{rel}[0,T]$
to be a person-by-person optimal policy can be derived following steps similar to those of
Theorem 6, and that these necessary conditions are the same as the necessary conditions for the
team optimal strategy. This is stated as a Corollary.

Corollary 1. (Necessary conditions for person-by-person optimality) Consider Problem 2 under
Assumptions 2, 3. Under the conditions of Theorem 6, Part (I), for an element $u^o \in \mathbb{U}^{(N)}_{rel}[0,T]$
with the corresponding solution $x^o \in B^\infty_{\mathbb{F}_T}([0,T],L^2(\Omega,\mathbb{R}^n))$ to be a person-by-person optimal
strategy, it is necessary that statements (1), (3) of Theorem 6, Part (I) hold, together with statement (2)
replaced by

$$E\Big\{\int_0^T \mathbb{H}(t,x^o(t),\psi^o(t),Q^o(t),u^{-i,o}_t,u^i_t - u^{i,o}_t)\,dt\Big\} \ge 0, \qquad \forall u^i \in \mathbb{U}^i_{rel}[0,T], \quad \forall i \in \mathbb{Z}_N. \tag{66}$$

Similar conclusions hold for strategies $\mathbb{U}^{(N),z^u}_{rel}[0,T]$.

Proof: The derivation is based on the same procedure as that of Theorem 6. The
only difference is that, in this case, the variations of the DM policies are carried out in the
direction of individual members while the rest of the members carry the optimal policy.

■ 

Clearly, every team optimal strategy for Problem 1 is a person-by-person optimal strategy for
Problem 2. Hence person-by-person optimality is weaker than team optimality. By comparing
the statements of Theorem 6 and Corollary 1 it is clear that statements (1) and (3) coincide,
while the only difference is between the variational inequalities (42) and (66). However, (66) implies
(42), and it can be shown that (42) implies (66). Indeed, if (66) is violated for some $j \in \mathbb{Z}_N$
then by choosing all other $u^i = u^{i,o}$, $\forall i \in \mathbb{Z}_N$, $i \ne j$, the left side of (42) will be negative,
which is a contradiction. This observation is new, and has not been documented in the static
team game literature [23].
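
The equivalence of (42) and (66) can be written out in two lines. The following display is a sketch in the notation of (42) and (66); the shorthand $\Delta_i(u^i)$ is introduced here only for illustration and is not part of the original text. It uses the fact that the Hamiltonian is linear in its last (measure) argument, so the $i$th term vanishes when $u^i = u^{i,o}$.

```latex
% Shorthand (illustrative): the i-th term of the variational inequality.
\Delta_i(u^i) \triangleq
  E\Big\{\int_0^T \mathbb{H}\big(t,x^o(t),\psi^o(t),Q^o(t),u_t^{-i,o},u_t^i-u_t^{i,o}\big)\,dt\Big\},
\qquad \Delta_i(u^{i,o}) = 0 .

% (66) => (42): sum the N nonnegative terms.
\Delta_i(u^i)\ge 0 \quad \forall i\in\mathbb{Z}_N
\;\Longrightarrow\; \sum_{i=1}^N \Delta_i(u^i)\ge 0 .

% (42) => (66): test (42) with u^i = u^{i,o} for all i \neq j, so only the j-th term survives.
0 \;\le\; \sum_{i=1}^N \Delta_i(u^i)\Big|_{u^i=u^{i,o},\ i\neq j} \;=\; \Delta_j(u^j),
\qquad \forall u^j\in\mathbb{U}^j_{rel}[0,T],\quad \forall j\in\mathbb{Z}_N .
```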

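Remark 3. From the above necessary conditions one can deduce the necessary conditions for
full centralized information and partial centralized information. We state these conditions below.

(1) Centralized Full Information Structures. Consider Problem 1 under the conditions of
Theorem 6, Part (I), and assume $u^i_t$ are adapted to $\mathbb{F}_{0,t}$, $\forall i \in \mathbb{Z}_N$. The necessary conditions are
given by the following point wise almost sure inequalities

$$\mathbb{H}(t,x^o(t),\psi^o(t),Q^o(t),\mu) \ge \mathbb{H}(t,x^o(t),\psi^o(t),Q^o(t),u^o_t), \qquad \forall \mu \in \mathcal{M}_1(\mathbb{A}^{(N)}), \quad a.e.\ t \in [0,T], \quad P\text{-a.s.}, \tag{67}$$

where $\{x^o(t),\psi^o(t),Q^o(t) : t \in [0,T]\}$ are the solutions of the Hamiltonian system (38), (39),
(40), (41). This corresponds to the classical case.

Moreover, if the strategies are based on centralized state feedback information, that is, $u^i_t$ are
adapted to the information $\mathcal{G}^{z^u}_{0,t}$, $\forall i \in \mathbb{Z}_N$, then under the conditions of Theorem 6, Part (II) the
previous optimality conditions are replaced by

$$E\Big\{\mathbb{H}(t,x^o(t),\psi^o(t),Q^o(t),\mu) \,\Big|\, \mathcal{G}^{z^u}_{0,t}\Big\} \ge E\Big\{\mathbb{H}(t,x^o(t),\psi^o(t),Q^o(t),u^o_t) \,\Big|\, \mathcal{G}^{z^u}_{0,t}\Big\},$$
$$\forall \mu \in \mathcal{M}_1(\mathbb{A}^{(N)}), \quad a.e.\ t \in [0,T], \quad P|_{\mathcal{G}^{z^u}_{0,t}}\text{-a.s.} \tag{68}$$

(2) Centralized Partial Information Structures. Consider Problem 1 under the conditions of
Theorem 6, Part (I) and Part (II), and suppose that each $u^i_t$ is adapted to the centralized partial
information $\mathcal{K}_{0,t} \subset \mathbb{F}_{0,t}$, and $\mathcal{K}_{0,t} \subset \mathcal{G}^{z^u}_{0,t}$, respectively. Then the necessary condition is given by

$$E\Big\{\mathbb{H}(t,x^o(t),\psi^o(t),Q^o(t),\mu) \,\Big|\, \mathcal{K}_{0,t}\Big\} \ge E\Big\{\mathbb{H}(t,x^o(t),\psi^o(t),Q^o(t),u^o_t) \,\Big|\, \mathcal{K}_{0,t}\Big\},$$
$$\forall \mu \in \mathcal{M}_1(\mathbb{A}^{(N)}), \quad a.e.\ t \in [0,T], \quad P|_{\mathcal{K}_{0,t}}\text{-a.s.}, \tag{69}$$

where $\mathcal{K}_{0,t}$ is a sub-sigma algebra of any of the sigma algebras indicated above.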

Finally, we mention two important results derived in [36] which have direct extensions to
the current paper. The first addresses existence of a measurable relaxed team optimal strategy
associated with the minimization of the Hamiltonian, and the second addresses the realization of
relaxed strategies by regular strategies.

B. Sufficient Conditions of Optimality

In this section, we show that the necessary conditions of optimality (43) are also sufficient
under certain convexity conditions.

Theorem 7. (Sufficient conditions for team optimality) Consider Problem 1 and suppose
Assumptions 2, 3 hold. Under the conditions of Theorem 6, Part (I), let $(u^o(\cdot),x^o(\cdot))$ denote any
control-state pair (decision-state) and let $\psi^o(\cdot)$ denote the corresponding adjoint processes.
Suppose the following conditions hold:

(C4) $\mathbb{H}(t,\cdot,\zeta,M,\nu)$, $t \in [0,T]$, is convex in $\xi \in \mathbb{R}^n$;

(C5) $\varphi(\cdot)$ is convex in $\xi \in \mathbb{R}^n$.

Then $(u^o(\cdot),x^o(\cdot))$ is team optimal if it satisfies (43). In other words, the necessary conditions are
also sufficient. For feedback strategies $\mathbb{U}^{(N),z^u}_{rel}[0,T]$ the same statement holds under the conditions
of Theorem 6, Part (II).

Proof: We shall prove the sufficiency under the conditions of Theorem 6, Part (I), that is, for the
admissible strategies $\mathbb{U}^{(N)}_{rel}[0,T]$, since the derivation is precisely the same for Part (II).
Let $u^o \in \mathbb{U}^{(N)}_{rel}[0,T]$ denote a candidate for the optimal team decision and $u \in \mathbb{U}^{(N)}_{rel}[0,T]$ any
other decision. Then

$$J(u^o) - J(u) = E\Big\{\int_0^T \big(\ell(t,x^o(t),u^o_t) - \ell(t,x(t),u_t)\big)\,dt + \big(\varphi(x^o(T)) - \varphi(x(T))\big)\Big\}. \tag{70}$$

By the convexity of $\varphi(\cdot)$,

$$\varphi(x(T)) - \varphi(x^o(T)) \ge \langle \varphi_x(x^o(T)),x(T) - x^o(T)\rangle. \tag{71}$$

Substituting (71) into (70) yields

$$J(u^o) - J(u) \le E\Big\{\langle \varphi_x(x^o(T)),x^o(T) - x(T)\rangle\Big\} + E\Big\{\int_0^T \big(\ell(t,x^o(t),u^o_t) - \ell(t,x(t),u_t)\big)\,dt\Big\}. \tag{72}$$
Applying the Ito differential rule to $\langle \psi^o,x - x^o\rangle$ on the interval $[0,T]$ and then taking expectation






we obtain the following equation. 

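$$\begin{aligned}
E\Big\{\langle \psi^o(T),x(T) - x^o(T)\rangle\Big\} ={}& E\Big\{\langle \psi^o(0),x(0) - x^o(0)\rangle\Big\} \\
&+ E\Big\{\int_0^T \langle -f_x^*(t,x^o(t),u^o_t)\psi^o(t) - V_{Q^o}(t) - \ell_x(t,x^o(t),u^o_t),\ x(t) - x^o(t)\rangle\,dt\Big\} \\
&+ E\Big\{\int_0^T \langle \psi^o(t),f(t,x(t),u_t) - f(t,x^o(t),u^o_t)\rangle\,dt\Big\} \\
&+ E\Big\{\int_0^T tr\big(Q^{o,*}(t)\sigma(t,x(t),u_t) - Q^{o,*}(t)\sigma(t,x^o(t),u^o_t)\big)\,dt\Big\} \\
={}& -E\Big\{\int_0^T \langle \mathbb{H}_x(t,x^o(t),\psi^o(t),Q^o(t),u^o_t),x(t) - x^o(t)\rangle\,dt\Big\} \\
&+ E\Big\{\int_0^T \langle \psi^o(t),f(t,x(t),u_t) - f(t,x^o(t),u^o_t)\rangle\,dt\Big\} \\
&+ E\Big\{\int_0^T tr\big(Q^{o,*}(t)\sigma(t,x(t),u_t) - Q^{o,*}(t)\sigma(t,x^o(t),u^o_t)\big)\,dt\Big\}.
\end{aligned} \tag{73}$$

Note that $\psi^o(T) = \varphi_x(x^o(T))$. Substituting (73) into (72) we obtain

$$\begin{aligned}
J(u^o) - J(u) \le{}& E\Big\{\int_0^T \Big[\mathbb{H}(t,x^o(t),\psi^o(t),Q^o(t),u^o_t) - \mathbb{H}(t,x(t),\psi^o(t),Q^o(t),u_t)\Big]\,dt\Big\} \\
&- E\Big\{\int_0^T \langle \mathbb{H}_x(t,x^o(t),\psi^o(t),Q^o(t),u^o_t),x^o(t) - x(t)\rangle\,dt\Big\}.
\end{aligned} \tag{74}$$

Since by hypothesis $\mathbb{H}$ is convex in $\xi \in \mathbb{R}^n$ and linear in $\nu \in \mathcal{M}_1(\mathbb{A}^{(N)})$, $\mathbb{H}$ is convex in both
$(\xi,\nu) \in \mathbb{R}^n \times \mathcal{M}_1(\mathbb{A}^{(N)})$. Using this fact in (74) we readily obtain

$$J(u^o) - J(u) \le E\Big\{\int_0^T \langle \mathbb{H}(t,x^o(t),\psi^o(t),Q^o(t),\cdot),u^o_t(\cdot) - u_t(\cdot)\rangle\,dt\Big\} \le 0, \qquad \forall u \in \mathbb{U}^{(N)}_{rel}[0,T], \tag{75}$$

where the last inequality follows from (43). This proves that $u^o$ is optimal and hence the necessary
conditions are also sufficient.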

■ 

Under conditions similar to those of Theorem 7 we can verify that a strategy is person-by-person
optimal for Problem 2 if it satisfies (43); this is stated as the next theorem. Indeed, the
necessary conditions for team optimality and person-by-person optimality are equivalent, and
person-by-person optimality implies team optimality.

Theorem 8. (Sufficient conditions for person-by-person optimality) Consider Problem 2 and
suppose Assumptions 2, 3 hold. Under the conditions of Theorem 6, Part (I), let $(u^o(\cdot),x^o(\cdot))$
denote any control-state pair and let $\psi^o(\cdot)$ denote the corresponding adjoint processes.
Suppose the conditions (C4), (C5) of Theorem 7 hold.
Then $(u^o(\cdot),x^o(\cdot))$ is person-by-person optimal if it satisfies (43).

For feedback strategies $\mathbb{U}^{(N),z^u}_{rel}[0,T]$ the above statements hold under the conditions of
Theorem 6, Part (II).

Proof: The proof is similar to that of Theorem 7. ■

V. Optimality Conditions for Regular Strategies

In the development of the necessary and sufficient conditions of optimality given in the
previous section we have given conditions which assert the existence of optimal decisions from
the class of relaxed decisions $\mathbb{U}^{(N)}_{rel}[0,T]$ and $\mathbb{U}^{(N),z^u}_{rel}[0,T]$ in Theorem 1.

The main observation of this section is that, if optimal regular decisions exist from the admissible
class $\mathbb{U}^{(N)}_{reg}[0,T] \subset \mathbb{U}^{(N)}_{rel}[0,T]$ (or the feedback class), then the necessary and sufficient conditions
of Theorem 6 and Theorem 7 can be specialized to the class of decision strategies which are
simply Dirac measures concentrated at $\{u^o_t : t \in [0,T]\} \in \mathbb{U}^{(N)}_{reg}[0,T]$ or $\mathbb{U}^{(N),z^u}_{reg}[0,T]$. The important
advantage of the theory of relaxed controls is that the necessary conditions of optimality for
ordinary controls follow readily from those of relaxed controls without requiring differentiability
of the Hamiltonian, or equivalently of the drift and the diffusion coefficients $f, \sigma$, with respect to
the control variables.

Thus we simply state the necessary and sufficient conditions of optimality for regular decentralized
decision strategies, which follow as a corollary of Theorems 6 and 7 by specializing to
Dirac measures concentrated along the regular decision strategies. This leads
to the following Hamiltonian

$$\mathcal{H} : [0,T] \times \mathbb{R}^n \times \mathbb{R}^n \times \mathcal{L}(\mathbb{R}^m,\mathbb{R}^n) \times \mathbb{A}^{(N)} \longrightarrow \mathbb{R},$$

where

$$\mathcal{H}(t,\xi,\zeta,M,\nu) \triangleq \langle f(t,\xi,\nu),\zeta\rangle + tr\big(M^*\sigma(t,\xi,\nu)\big) + \ell(t,\xi,\nu), \qquad t \in [0,T]. \tag{76}$$

Theorem 9. (Regular team optimality conditions) Consider Problem Ul under the Assumptions 
of Theorem^with decisions (or controls) from the regular class taking values in A*, a closed, 
bounded and convex subset o/M*, \fi ^ "Z^. 
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(I) Let ¥t denote the filtration generated by x(0) and the Brownian motion W. 
Necessary Conditions. For an element u° G vi^g [0, T] with the corresponding solution 
x° G -B^([0, T], M")) to be team optimal, it is necessary that the following hold. 

(1) There exists a semi martingale m° G SM.1[Q,T] with the intensity process 

{r.Q°) e L2^([0,r],M") X L2^([0,T],£(M™,M")). 

(2) The variational inequality is satisfied: 

N „T 

^ e{ / {n{t, x°{t), r{t), Q"{t), ui, v°) - n{t, x%t), r{t), Q^it), <)) dt] 

\/ueUi^^[0,T]. (77) 

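(3) The process $(\psi^o, Q^o) \in L^2_{\mathbb{F}_T}([0,T],\mathbb{R}^n) \times L^2_{\mathbb{F}_T}([0,T],\mathcal{L}(\mathbb{R}^m,\mathbb{R}^n))$ is a unique
solution of the backward stochastic differential (adjoint) equation of the relaxed case with $\mathbb{H}$
replaced by $\mathcal{H}$, such that $u^o \in \mathbb{U}^{(N)}_{reg}[0,T]$ satisfies the pointwise almost
sure inequalities with respect to the $\sigma$-algebras $\mathcal{G}^i_{0,t} \subset \mathbb{F}_{0,t}$, $t \in [0,T]$, $i = 1,2,\ldots,N$:

$$\mathbb{E}\Big\{ \mathcal{H}(t,x^o(t),\psi^o(t),Q^o(t),u^i,u^{-i,o}_t) - \mathcal{H}(t,x^o(t),\psi^o(t),Q^o(t),u^o_t) \,\Big|\, \mathcal{G}^i_{0,t} \Big\} \geq 0, \quad \forall u^i \in \mathbb{A}^i,\ \text{a.e. } t \in [0,T],\ \mathbb{P}|_{\mathcal{G}^i_{0,t}}\text{-a.s.},\ i = 1,2,\ldots,N. \tag{78}$$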

Sufficient Conditions. Let $(u^o(\cdot), x^o(\cdot))$ denote an admissible decision and
state pair and $\psi^o(\cdot)$ the corresponding adjoint processes.
Suppose the conditions (C4), (C5) hold and in addition
(C6) $\mathcal{H}(t,\xi,\zeta,M,\cdot)$, $t \in [0,T]$, is convex in $u \in \mathbb{A}^{(N)}$.

Then $(x^o(\cdot), u^o(\cdot))$ is optimal if it satisfies (78).

(II) Suppose $\mathbb{F}_T$ is the filtration generated by $x(0)$ and the Brownian motion $W$, and
Assumptions 5 hold with decision policies from the regular class. The necessary and
sufficient conditions for a feedback policy $u^o \in \mathbb{U}^{(N),z^u}_{reg}[0,T]$ to be optimal are given
by the statements under Part (I) with $\mathcal{G}^i_{0,t}$ replaced by $\mathcal{G}^{z^{i,u}}_{0,t}$, $\forall t \in [0,T]$.

Proof: Follows from Theorems 6 and 7 by simply replacing relaxed controls by Dirac measures
concentrated at $\{u^o_t : t \in [0,T]\} \in \mathbb{U}^{(N)}_{reg}[0,T]$ or $\mathbb{U}^{(N),z^u}_{reg}[0,T]$.
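The pointwise condition (78) says that each DM minimizes the Hamiltonian conditioned on its own information, with the other DMs held at their optimal decisions. A minimal sketch of this idea on a finite sample space is given below, where conditioning on the $\sigma$-algebra of DM $i$ is replaced by conditioning on the cells of a partition. The function `conditional_argmin`, the toy cost `h`, the partition, and the action grid are all hypothetical and serve only as an illustration; they are not the authors' construction.

```python
import numpy as np

def conditional_argmin(h, omegas, probs, cells, actions):
    """For each information cell, pick the action minimizing E[h(omega, a) | cell]."""
    best = {}
    for label, idx in cells.items():
        # Conditional probabilities of the outcomes inside this cell.
        weights = probs[idx] / probs[idx].sum()
        # Conditional expected cost of every candidate action.
        costs = [float(np.dot(weights, [h(omegas[j], a) for j in idx]))
                 for a in actions]
        best[label] = float(actions[int(np.argmin(costs))])
    return best

# Hypothetical toy data: four equally likely outcomes, two information cells.
omegas = np.array([-1.0, 0.0, 1.0, 2.0])
probs = np.full(4, 0.25)
cells = {"low": [0, 1], "high": [2, 3]}   # the DM only learns which cell occurred
actions = np.linspace(-2.0, 2.0, 9)       # grid over a bounded convex action set

# Toy stand-in for the Hamiltonian as a function of the DM's own action a.
h = lambda omega, a: 0.5 * a**2 - omega * a

best = conditional_argmin(h, omegas, probs, cells, actions)
# best == {"low": -0.5, "high": 1.5}: the conditional mean of omega on each cell
```

For this quadratic toy cost the minimizer on each cell is the conditional mean of the outcome, so a DM with coarser information acts on a coarser estimate. This is the sense in which the conditional Hamiltonian differs from its centralized counterpart.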
Person-by-person optimality conditions for regular decision strategies follow from their relaxed
counterparts, as discussed above. Therefore we simply state the results as a corollary. 

Corollary 2. (Person-by-person optimality) Consider Problem 2 under the conditions of Theorem 9.
Then the necessary and sufficient conditions of Theorem 9 hold with the variational
inequality (77) replaced by

$$\mathbb{E}\Big\{ \int_0^T \Big( \mathcal{H}(t,x^o(t),\psi^o(t),Q^o(t),u^i_t,u^{-i,o}_t) - \mathcal{H}(t,x^o(t),\psi^o(t),Q^o(t),u^{i,o}_t,u^{-i,o}_t) \Big)\, dt \Big\} \geq 0, \quad \forall u^i \in \mathbb{U}^i_{reg}[0,T],\ \forall i \in \mathbb{Z}_N. \tag{79}$$

Similar conclusions hold for strategies $\mathbb{U}^{(N),z^u}_{reg}[0,T]$.

Proof: Follows from the corresponding corollary for relaxed strategies by simply replacing relaxed controls by Dirac measures
concentrated at $\{u^o_t : t \in [0,T]\} \in \mathbb{U}^{(N)}_{reg}[0,T]$ or $\mathbb{U}^{(N),z^u}_{reg}[0,T]$.

■ 
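As a hedged illustration, suppose in addition that $\mathcal{H}$ is continuously differentiable in $u^i$ and that differentiation may be interchanged with conditional expectation. Neither assumption is required by Theorem 9 or Corollary 2. The pointwise inequality (78) then takes the familiar first-order form, because $u^{i,o}_t$ is measurable with respect to the information $\mathcal{G}^i_{0,t}$ of DM $i$ and $\mathbb{A}^i$ is convex:

```latex
% Sketch under the extra differentiability assumption stated above.
% Insert u^i = u^{i,o}_t + \varepsilon (v - u^{i,o}_t), v \in \mathbb{A}^i, into (78),
% divide by \varepsilon > 0 and let \varepsilon \downarrow 0:
\Big\langle \mathbb{E}\Big\{ \mathcal{H}_{u^i}\big(t, x^o(t), \psi^o(t), Q^o(t), u^o_t\big)
      \,\Big|\, \mathcal{G}^i_{0,t} \Big\},\; v - u^{i,o}_t \Big\rangle \;\geq\; 0,
\qquad \forall v \in \mathbb{A}^i,\ \text{a.e. } t \in [0,T],\ \mathbb{P}\text{-a.s.},\ i = 1,\ldots,N.

% If u^{i,o}_t lies in the interior of \mathbb{A}^i, this reduces to conditional stationarity:
\mathbb{E}\Big\{ \mathcal{H}_{u^i}\big(t, x^o(t), \psi^o(t), Q^o(t), u^o_t\big)
      \,\Big|\, \mathcal{G}^i_{0,t} \Big\} = 0 .
```

When every DM observes the full filtration, so that $\mathcal{G}^i_{0,t} = \mathbb{F}_{0,t}$, the conditional expectation acts as the identity on adapted integrands and the condition becomes the stationarity condition of centralized decision making.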

The optimality conditions are derived based on the assumption that the filtration $\mathbb{F}_T$ is generated
by the system Brownian motions $\{W(t) : t \in [0,T]\}$. When this condition does not hold
the optimality conditions are slightly modified as discussed in the next remark.

Remark 4. Suppose $\mathbb{F}_T$ is not generated by the Brownian motions $\{W(t) : t \in [0,T]\}$ but stochastic
integrals with respect to $W(\cdot)$ are $\mathbb{F}_T$-martingales. Then by invoking the variation of the semi
martingale representation due to Kunita-Watanabe (for the derivation see [?7]) we have the
following. If (i) $L^2(\Omega,\mathbb{F},\mathbb{P})$ is separable and (ii) $\mathbb{F}_T$ is right continuous having left limits,
then any square integrable semi martingale has the decomposition

$$m(t) = m(0) + \int_0^t v(s)\,ds + \int_0^t \Sigma(s)\,dW(s) + M(t), \quad t \in [0,T], \tag{80}$$

for some $v \in L^2_{\mathbb{F}_T}([0,T],\mathbb{R}^n)$, $\Sigma \in L^2_{\mathbb{F}_T}([0,T],\mathcal{L}(\mathbb{R}^m,\mathbb{R}^n))$, an $\mathbb{R}^n$-valued $\mathbb{F}_{0,0}$-measurable random
variable $m(0)$ having finite second moment, and $\{M(t) : t \in [0,T]\}$ a right continuous square
integrable $\mathbb{F}_T$-martingale, which is orthogonal to $\{W(t) : t \in [0,T]\}$. This representation
is unique. Further, the stochastic integrals with respect to $W$ and with respect to $M$ are orthogonal
martingales for $L^2$ integrands. In this case the adjoint equation is replaced
by

$$d\psi(t) = -\mathcal{H}_x(t, x(t), \psi(t), Q(t), u_t)\,dt + Q(t)\,dW(t) + dM(t), \quad t \in [0,T), \tag{81}$$
$$\psi(T) = \varphi_x(x(T)). \tag{82}$$
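As a consistency check, written as a sketch in the notation of Remark 4, consider the case in which the filtration is generated by $x(0)$ and $W$. The martingale representation theorem then expresses every square integrable martingale as a stochastic integral with respect to $W$, so the orthogonal component $M$ vanishes and (81), (82) collapse to the adjoint equation without the extra martingale term:

```latex
% Special case M \equiv 0 (filtration generated by x(0) and W):
d\psi(t) = -\mathcal{H}_x\big(t, x(t), \psi(t), Q(t), u_t\big)\,dt + Q(t)\,dW(t),
\quad t \in [0,T), \qquad \psi(T) = \varphi_x\big(x(T)\big).
```

The only change introduced by a larger filtration is therefore the additive term $dM(t)$; the drift of the adjoint process and its terminal condition keep the same form.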

In view of the results obtained, we confirm that there are no limitations in applying the classical
theory of optimization to decentralized systems. Rather, the challenge is in the implementation
of the new variational Hamiltonians and the computation of the optimal strategies for specific
examples. In Part II [38] of this two-part paper, we shall apply these optimality conditions to
investigate various linear and nonlinear distributed stochastic team games and obtain closed form
expressions for the optimal strategies for some of them.

VI. Conclusions and Future Work 

In this paper we have considered team games for distributed stochastic dynamical decision 
systems, with decentralized noiseless information patterns for each DM, under relaxed and 
deterministic strategies. Necessary and sufficient optimality conditions with respect to team
optimality and person-by-person optimality criteria are derived, based on the stochastic Pontryagin
minimum principle, and existence of the optimal strategies is also discussed.
The methodology is very general, and applicable to many areas. However, several additional 
issues remain to be investigated. Below, we provide a short list. 

(F1) For team games with regular strategies and non-convex action spaces $\mathbb{A}^i$, $i = 1,2,\ldots,N$,
if the diffusion coefficients depend on the decision variables then it is necessary to derive
optimality conditions based on second-order variations. The methodology presented to
derive the necessary conditions of optimality can be easily extended to cover this case
as well.

(F2) The derivation of optimality conditions can be used in other types of games such as
Nash-equilibrium games with decentralized information structures for each DM, and
minimax games.

(F3) The optimality conditions can be extended to distributed stochastic dynamical decision
systems driven by both continuous Brownian motion processes and jump processes,
such as Lévy or Poisson jump processes, by following the procedure of centralized
strategies in [36].

(F4) The optimality conditions can be applied to specific examples with decentralized noiseless
information structures. Some of these are presented in the companion paper [38].

(F5) The methodology can be extended to cover decentralized partial (noisy) information
structures.